Cognitive modeling system including repeat processing elements and on-demand elements

ABSTRACT

The present design is directed to a system for performing cognitive modeling, including an event acquirer configured to acquire an event comprising an associated date and set of data fields, an analyzer element comprising a plurality of components repeated for each field in an event received from the event acquirer, wherein the analyzer element applies thresholds to each event, determines outliers, evaluates time-ordered behavior, and predicts threshold violations for the event, a periodic set of components configured to operate periodically on demand, the periodic set of components configured to perform peer to peer analysis, actor correlation analysis, actor behavior analysis, semantic rule analysis, and predict rates of change, and a plurality of signal managers interfacing with the analyzer element and the periodic set of components configured to exclude signals based on content properties of data transmitted.

The present application claims the benefit of U.S. Provisional Patent Application Ser. No. 62/397,866, filed Sep. 21, 2016, inventors Michael E. Cormier et al., entitled “Apparatus and Process for Analyzing and Categorizing the Behavior of Actors in an Environment,” and further claims the benefit of U.S. Provisional Patent Application Ser. No. 62/467,414, filed Mar. 6, 2017, inventors Michael E. Cormier et al., entitled “Cognitive Modeling Apparatus and Method Including Compute Servers, Statistical Modeling, and Relevancy Engine,” the entirety of both of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates generally to the field of computing systems, and more specifically to computing systems employing cognitive modeling.

Description of the Related Art

Humans typically think about physical behavior using qualitative concepts, such as “risky” or “dangerous,” instead of a precise numerical value, such as “there's a 67% chance X will kill you.” As a consequence, when processing data, most humans are typically interested in identifying data items that satisfy qualitative criteria. For example, an analyst may want to identify people sending an “unusual amount of email” as opposed to “more than 10 emails.” In existing analysis systems, an analyst is typically unable to translate qualitative behavior criteria into corresponding numerical values before performing an analysis. This process, if it exists at all, is both time-consuming and problematic because the numerical values for specific qualitative behavior criteria can vary considerably among different contexts. For example, the definition of an “unusual number of failed logins” by a computer user is likely to be different during a peak-usage time in the middle of the day in comparison to a low-usage time in the middle of the night. Also, the definition of a specific qualitative behavior criterion can vary among different users and different contexts.

While such systems can have benefits, one drawback is the inability to change attributes or criteria dynamically. For example, while a “short response time” may have a given connotation in one set of circumstances at a particular time within a system, the same system may warrant a different set of parameters when circumstances change. The question is how can the system change or adapt depending on the parameters of interest, and more particularly, how can such machines be extended, tuned, and deployed.

A further issue with available designs is the ability to process and operate on large amounts of data, such as data that appears to be relatively infinite. It can also be difficult to scale the performance of multiple disjoint algorithms across an actor population, particularly when such an actor population is exceedingly large or apparently infinite.

It would therefore be advantageous to provide a system that overcomes the issues with current modeling and reasoning devices and enables dynamic assessment and alteration of system related parameters based on changed circumstances encountered, particularly when exceedingly large and/or apparently infinite amounts of information are considered.

SUMMARY OF THE INVENTION

Thus according to one aspect of the present design, there is provided a system for performing cognitive modeling, comprising an event acquirer configured to acquire an event comprising an associated date and set of data fields, an analyzer element comprising a plurality of components repeated for each field in an event received from the event acquirer, wherein the analyzer element applies thresholds to each event, determines outliers, evaluates time-ordered behavior, and predicts threshold violations for the event, a periodic set of components configured to operate periodically on demand, the periodic set of components configured to perform peer to peer analysis, actor correlation analysis, actor behavior analysis, semantic rule analysis, and predict rates of change, and a plurality of signal managers interfacing with the analyzer element and the periodic set of components configured to exclude signals based on content properties of data transmitted. The plurality of components and the periodic set of components are configured to interface with a threat detector.

According to a further aspect of the present design, there is provided a system for performing cognitive modeling, comprising an event acquirer configured to acquire an event comprising an associated date and set of data fields, an analyzer element comprising a plurality of components repeated for each field in an event received from the event acquirer, a periodic set of components configured to operate periodically on demand to analyze and predict based on information received from the analyzer element, and a plurality of signal managers interfacing with the analyzer element and the periodic set of components, wherein the periodic set of components is configured to exclude signals based on content properties of data transmitted. The plurality of components and the periodic set of components are configured to interface with a threat detector.

According to a further aspect of the present design, there is provided a cognitive modeling apparatus, comprising an event acquirer, an updating and evaluating arrangement comprising hardware configured to apply thresholds, update event related data, predict thresholds and determine outliers from events received from the event acquirer, and a periodic/on demand apparatus configured to analyze event data on demand, and a series of signal managers comprising a first signal manager connected to the updating and evaluation arrangement and a second signal manager connected to the periodic/on demand apparatus. The series of signal managers are configured to exclude signals based on content properties.

These and other advantages of the present invention will become apparent to those skilled in the art from the following detailed description of the invention and the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present disclosure, reference is now made to the following figures, wherein like reference numbers refer to similar items throughout the figures:

FIG. 1 is a high level operational perspective of the present system showing the way in which a cognitive modeling model is created and how this model is executed;

FIG. 2 is a detailed schematic roadmap of core system functions;

FIG. 3 shows a first high level view of cognitive modeling;

FIG. 4 is a further cognitive modeling high-level overview;

FIG. 5 illustrates an example of a compressed matrix;

FIG. 6 visually depicts a week-by-week comparison; and

FIG. 7 is a high level functional overview of a relevancy engine connecting rules with a weighted relevancy graph.

DETAILED DESCRIPTION

The following description and the drawings illustrate specific embodiments sufficiently to enable those skilled in the art to practice the system and method described. Other embodiments may incorporate structural, logical, process and other changes. Examples merely typify possible variations. Individual elements and functions are generally optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in, or substituted for, those of others.

Modern data centers often comprise thousands of host computer systems that operate collectively to service requests from even larger numbers of remote clients. During operation, these data centers generate significant volumes of performance data and diagnostic information that can be analyzed to quickly diagnose performance problems. In order to reduce the size of this performance data, the data is typically pre-processed prior to being stored based on anticipated data-analysis needs. For example, pre-specified data items can be extracted from the performance data and stored in a database to facilitate efficient retrieval and analysis at search time. However, the rest of the performance data is not saved and is essentially discarded during pre-processing. As storage capacity becomes progressively cheaper and more plentiful, there are fewer incentives to discard this performance data and many reasons to keep it.

This plentiful storage capacity is presently making it feasible to store massive quantities of minimally processed performance data at “ingestion time” for later retrieval and analysis at “search time.” Performing the analysis operations at search time provides greater flexibility because it enables an analyst to search all of the performance data rather than searching pre-specified data items stored at ingestion time. The analyst may then investigate different aspects of the performance data instead of being confined to the pre-specified set of data items that were selected at ingestion time.

However, analyzing massive quantities of heterogeneous performance data at search time can be a challenging task, particularly in a dynamic and constantly changing environment, and particularly when the amount of data is excessively large and seemingly infinite. A data center may generate heterogeneous performance data from thousands of different components, which can collectively generate tremendous volumes of performance data that can be time-consuming to analyze. For example, this performance data can include data from system logs, network packet data, sensor data, and data generated by various applications. Also, the unstructured nature of much of this performance data can pose additional challenges because of the difficulty of applying semantic meaning to unstructured data, and the difficulty of indexing and querying unstructured data using traditional database systems.

Definitions

For purposes of understanding the present design, a set of definitions is provided. Certain terms are provided with initial capital letters but in certain instances initial letters have not been capitalized, including in the claims provided herewith. The interpretation of terminology is intended broadly and may or may not specifically include the defined terms below depending on use.

An Actor may be a person, a device, a transaction, a service, a process, or any other entity that is a core or critical component to a business environment.

An Actor performs Actions. An Action is an event that is represented by a Taxonomy.

Each specific value of the Taxonomy is referred to as a Terrain.

A collection of Terrains involving the same Actor is called a Landscape.

The definition of a Taxonomy is stored in a Data Dictionary.

A Cognitive Model contains the representation of the behaviors of a collection of Actors over a period of time. This knowledge is represented in many forms, including, but not limited to, Landscapes, Terrains, Terrain Cells, Sequence Graphs, Relational Graphs, Fuzzy Rules and Fuzzy Clusters. Cognitive Models store their information in the appropriate data store, depending on the form of the data, as described above.

The information initially stored in a Cognitive Model is determined by which Analyses are relevant to the Taxonomy. This list is stored in the associated Data Dictionary entry for the Taxonomy.

Cognitive Models, amongst other functions, provide the information necessary to perform the initial Analysis of Actor Behavior.

Cognitive Models may represent their “trustworthiness” (Trust) as a fuzzy number between 0 and 1. This Trust is calculated by analyzing the density (amount) of information over the expected density (amount) of information for each Taxonomy. This value is dynamic and will change over time. The higher the Trust, the more accurate (trusted) the results of Anomaly Analyses run against the Cognitive Model.
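By way of illustration only, the Trust calculation described above may be sketched as follows. The function name, the per-Taxonomy clipping, and the use of a simple average to combine Taxonomies are assumptions made for this example; the description specifies only that observed density is compared against expected density and that the result lies between 0 and 1.

```python
def model_trust(observed_density: dict, expected_density: dict) -> float:
    """Return a Trust value in [0, 1] for a Cognitive Model.

    Assumption: Trust is the mean, over Taxonomies, of the ratio of
    observed to expected information density, clipped to [0, 1].
    """
    ratios = []
    for taxonomy, expected in expected_density.items():
        if expected <= 0:
            continue  # no expectation defined for this Taxonomy; skip it
        observed = observed_density.get(taxonomy, 0.0)
        ratios.append(min(1.0, max(0.0, observed / expected)))
    return sum(ratios) / len(ratios) if ratios else 0.0


# Hypothetical densities: half the expected email events, twice the logins.
trust = model_trust({"email": 50, "login": 200}, {"email": 100, "login": 100})
```

In this sketch a Taxonomy with more data than expected contributes at most 1, so surplus data in one Taxonomy cannot mask sparse data in another.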

An Anomaly Analysis is an algorithm that ingests information from a Cognitive Model and generates representations of anomalous Actor Behavior. These Anomaly Analyses include, but are not limited to: Threshold, Sequence, Weekly Actor, Week-over-Week Actor, Rule and Peer Analysis.

The anomalies are generated in the form of Cognitive Signals.

A Cognitive Signal is a language construct that represents some form of persistent knowledge. Cognitive Signals can be used as input and/or output of any analysis, algorithm or other component that ingests or generates persistent knowledge. In this system, Cognitive Signals represent persistent knowledge about Actor Behavior.

Cognitive Signals are the input for Hazard Analysis and Threat Analysis.

An Environment is a defined place where Actors perform Actions.

The System is defined as the collection of all components, data stores, algorithms and any executable necessary for the Process to function.

A Resource Mapping describes the relationship of external data fields to data fields in the System. The following is a list of mapping types (more or fewer may be provided):

COPY-VALUE

SPLIT-VALUE-MULTI-FIELDS

VALUE-MAPPING

DATE-STRING-TO-EPOCH

REGEX-CAPTURE-GROUP

SEARCH-AND-REPLACE

TOKENIZE-AND-EXTRACT
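As a non-limiting sketch, four of the mapping types listed above might be applied to an external record as follows. The description names the mapping types but does not define their parameters, so the dictionary keys used here ("type", "source", "target", "table", "format", "pattern") and the behavior of each type are assumptions inferred from the type names.

```python
import re
from datetime import datetime, timezone


def apply_mapping(mapping: dict, record: dict) -> dict:
    """Map one external field to one System field (hypothetical layout)."""
    kind = mapping["type"]
    raw = record[mapping["source"]]
    if kind == "COPY-VALUE":
        value = raw
    elif kind == "VALUE-MAPPING":
        # Translate through a lookup table; unknown values pass through.
        value = mapping["table"].get(raw, raw)
    elif kind == "DATE-STRING-TO-EPOCH":
        # Assumption: date strings are interpreted as UTC.
        parsed = datetime.strptime(raw, mapping["format"])
        value = int(parsed.replace(tzinfo=timezone.utc).timestamp())
    elif kind == "REGEX-CAPTURE-GROUP":
        match = re.search(mapping["pattern"], raw)
        value = match.group(1) if match else None
    else:
        raise ValueError(f"unsupported mapping type: {kind}")
    return {mapping["target"]: value}


record = {"user": "jdoe@example.com", "ts": "1970-01-02 00:00:00"}
actor = apply_mapping(
    {"type": "REGEX-CAPTURE-GROUP", "source": "user",
     "target": "actor", "pattern": r"^([^@]+)@"},
    record,
)
when = apply_mapping(
    {"type": "DATE-STRING-TO-EPOCH", "source": "ts",
     "target": "time", "format": "%Y-%m-%d %H:%M:%S"},
    record,
)
```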

A Graph is a collection of nodes (activities) and edges (links between activities) which is initially built with historical data to define an Actor's normal sequence of events on a periodic basis. In the present discussion, there is an individual graph by Actor and day of week to allow the System to track activities by day of the week. This tracking can be significant because what an Actor does on a Monday can differ greatly from what the Actor does on Friday, Saturday, or Sunday (if anything).

A Node is an activity, described by a Taxonomy, performed by an Actor that is tracked in a graph. Examples of nodes are things like: ‘log into VPN’, ‘send an email’, ‘access a web site’, ‘log off VPN’, etc.

An Edge connects transitions between nodes and maintains a vector of times that the transition occurs, frequency count, and probabilities of transitions between nodes. An example of an Edge is a connection between ‘login to VPN’ and ‘send email’ or ‘login to VPN’ and ‘access a web site’. Each time a transition between two nodes occurs, the edge is updated with the time of the transition, and the frequency count and probability of the transition are updated.
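The Edge bookkeeping described above may be illustrated with the following minimal sketch. The class and method names are hypothetical, and the transition probability is assumed to be the relative frequency of the edge among all transitions leaving the same node; the description does not state how the probability is computed.

```python
from collections import defaultdict


class TransitionGraph:
    """Per-Actor, per-day-of-week graph of Action-to-Action transitions."""

    def __init__(self):
        self.times = defaultdict(list)      # (src, dst) -> transition times
        self.counts = defaultdict(int)      # (src, dst) -> frequency count
        self.out_totals = defaultdict(int)  # src -> transitions leaving src

    def record(self, src: str, dst: str, timestamp: float) -> None:
        """Update an edge each time the transition src -> dst occurs."""
        edge = (src, dst)
        self.times[edge].append(timestamp)
        self.counts[edge] += 1
        self.out_totals[src] += 1

    def probability(self, src: str, dst: str) -> float:
        """Relative frequency of src -> dst among transitions from src."""
        total = self.out_totals.get(src, 0)
        return self.counts.get((src, dst), 0) / total if total else 0.0


graph = TransitionGraph()
graph.record("login to VPN", "send email", 9.0)
graph.record("login to VPN", "send email", 9.5)
graph.record("login to VPN", "access a web site", 10.0)
p_email = graph.probability("login to VPN", "send email")
```

Because probabilities are derived from the counts on demand, every recorded transition implicitly updates the probabilities of all edges leaving the same node.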

An Anomaly is behavior that is inconsistent with normal behavior. An Anomaly has two important properties: severity and degree of inconsistency (or mathematical distance) from normal behavior. The severity is a class of Anomaly assigned by the analytical component that detected and measured the Anomaly. The degree of inconsistency is a fuzzy number.

A Hazard is an unperfected Threat associated with an Actor without regard to any related assets or the behavior of other Actors. A Hazard represents the risk to an enterprise based solely on cumulative multi-dimensional behaviors (that is, anomalous states generated from thresholds, orders of operation (AOO), peer-to-peer similarity, and the Actor's behavior change over time). A Hazard has two particular properties: severity and weighted risk. A Hazard has a severity that is not assigned, but is derived from the collection of inherent terrain risks.

A Threat ties together Actors with the behaviors of other Actors as well as the Assets used by all the Actors in the Hazard collection. Threats develop a sequence of operations over a dynamic time frame. Threats are connected in a heterogeneous Graph where Nodes can be Actors or Assets and the Edges define the frame as well as the strength of their connection as a function of risk.

Criticality, used in Threat Detection, is an importance value (e.g. 1-100) assigned to a resource such as an Actor or Asset. The higher the value assigned, the more important or critical the resource. For example, a server where sensitive enterprise information resides (such as credit card numbers) would have a much higher criticality (e.g. 100) than a server used internally for software testing (e.g. 20). Likewise, an Actor with access to sensitive information like a CFO would have a higher criticality (e.g. 100) than an administrative assistant (e.g. 30).

Actor Criticality: Each Actor is assigned a criticality value based on how much importance should be assigned to the Actor in the Threat Detection process.

Asset Criticality: Each Asset is assigned a criticality value based on how much importance should be assigned to the Asset in the Threat Detection process.

Actor Threat: The Threat Detection process generates an Actor threat for cases when the System determines that an Actor's actions during the threat window time period represent a threat to the enterprise.

Asset Threat: The Threat Detection process generates an Asset threat for cases when the System determines that an Asset associated with one or more Actor threats during the threat window time period is at risk.

Compatibility Index (CIX): The measurement of the fuzzy number that represents how compatible a value is with a Concept.

Cognitive Genetic Algorithm is a method for solving both constrained and unconstrained optimization problems based on a natural selection process that mimics biological evolution utilizing cognitive processes.

Cognitive Evolutionary Strategy is the method for making decisions utilizing Cognitive Genetic Algorithms.

Concept: A qualitative semantic term represented as a fuzzy set.

Context: A collection of Concepts that are mapped to a field or value.
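For illustration, the relationship among a Concept, a Context, and the Compatibility Index may be sketched as follows. The triangular shape of the fuzzy set, the class and field names, and the numeric breakpoints are assumptions chosen for this example; the definitions above do not prescribe a membership function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A qualitative term modeled as a triangular fuzzy set (assumed shape).

    Requires low < peak < high.
    """
    name: str
    low: float
    peak: float
    high: float

    def cix(self, value: float) -> float:
        """Compatibility Index of a value with this Concept, in [0, 1]."""
        if value <= self.low or value >= self.high:
            return 0.0
        if value <= self.peak:
            return (value - self.low) / (self.peak - self.low)
        return (self.high - value) / (self.high - self.peak)


# A Context: the Concepts mapped to a hypothetical "emails per hour" field.
emails_per_hour = {
    "typical": Concept("typical", 0.0, 5.0, 15.0),
    "unusual": Concept("unusual", 10.0, 30.0, 50.0),
}

scores = {name: c.cix(12.0) for name, c in emails_per_hour.items()}
```

A single value can be partially compatible with more than one Concept at once, which is how the qualitative overlap between “typical” and “unusual” is expressed in this sketch.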

The present design is a cognitive modeling platform providing an advanced fuzzy logic based plug-and-play architecture. Using this architecture, machine intelligence applications can be designed, implemented, extended, tuned, and deployed. The platform provides a unique and powerful machine reasoning system that combines qualitative semantics, linguistic variables, fuzzy logic based versions of machine learning and machine intelligence algorithms, multi-objective and multi-constraint genetic strategy functions, as well as a near natural language rule and explanatory capability.

Two components of note in the present design are the Cognitive Modeler (CM) platform and the cognitive modeling suite, component-based system architectures designed to incorporate and extend machine learning, general machine intelligence, computer science, and lines of business analytical functionality. Based on the twin ideas of qualitative semantics (fuzzy logic) and a cognitive signal's “glue,” product functionality can be added, extended, removed, or modified. The functional capabilities are designed in terms of “cognitive signal aware” components, enabling straightforward insertion or removal of components, since their only public interface is via Cognitive Signals (aside from the globally underlying Taxonomies in the over-all system ecosystems).

Conventional methods of incorporating machine learning, planning, and prediction have centered on three principal techniques. Statistical inference systems encompass not only traditional statistical analysis approaches (such as those supported by SAS and SPSS), but also machine learning capabilities based on clustering, Bayesian probabilities, decision trees, as well as partitioning and classification. Expert systems are if-then-else rule driven applications that apply an “inference engine” to evaluate rules and follow a path to a particular solution. The rules can be developed by subject matter experts or as part of a statistical analysis (in a manner similar to statistical inference systems or through the process of data mining). The third technique is the use of neural networks, representing layered connectionist machines (either in software or hardware) that mimic the way neurons in the human brain supposedly collect data, aggregate related data elements, prioritize important data relationships, and learn patterns. Many powerful machine intelligence systems, such as IBM's Watson, are based on neural networks.

These approaches are not mutually exclusive. Rules are often combined with or generated from neural networks. Neural networks and rules are often generated by the statistical analysis of data patterns in a process known as “data mining.”

However, none of the existing methods of machine learning and machine intelligence incorporate qualitative expression and semantic reasoning implemented through use of fuzzy logic. These techniques lack the ability to consider “shades of grey,” reason with conflicting information, reason at a conceptual level as a matter of first principles, and deal with the nature of data in a way that reduces cognitive dissonance (that is, in a way that allows the machine to imitate the natural way humans understand patterns in data).

The term “fuzzy logic” has been used in a broad, loose, and imprecise way among the population to mean any system that deals with uncertainties. This, however, is not the real epistemological foundation of fuzzy logic. Fuzzy logic deals with informational ambiguity and deals at the conceptual level rather than the data level. Ability to reason about concepts is a first order capability of fuzzy logic. Reasoning at the concept level is not possible in any other logic or algebraic system.

The present design includes a cognitive modeling component architecture as well as a description of the operational (functional) processes in the architecture. The cognitive modeling component is constructed based on a set of related concepts. The principal concept is an Actor, and functions of the system are organized around a collection of Actors. An Actor causes a change to the state of the underlying model. A cognitive model learns the time-varying behavior of an Actor. The behavior of a single Actor may be influenced by the behavior of one or more other Actors (of various types). Information and knowledge in the cognitive modeling component are organized around Actors according to the following functional and operational architecture. An Actor can be a person (a human being), a machine (a device of some kind), a thing (a vehicle, toaster, meter, sensor, a document, etc.), an automaton (independent intelligent cellular automata), a service (such as a network), or a composite (some combination of Actor types). The properties (characteristics) of an Actor and the properties of an event are determined by a set of application specific Common Information Model (CIM) objects. Cognitive modeling CIMs can be fused with client data models. Associated with an Actor is a collection of Terrains and a collection of graphs (such as the day-of-week Markov graphs). Terrains and graphs are created from incoming event data. Event data is organized in a set of Taxonomies.

A Taxonomy is a tree-like structure that organizes data according to a precise genotype. A genotype defines the category (or name) of the information at each level in the tree. For a person the genotype means the Actor, an Action, a process, and an event data value (a field or element). The actual values in a Taxonomy constitute the phenotype (which describes the external or instantiated characteristics of the Taxonomy).

Different Actor types have (or can have) different Taxonomic genotypes. Data elements can be shared, via their Taxonomies, across Actors. A Terrain is associated with a data field (either contained in an event or computed from other event or other computed data fields). The Terrain is used to learn the time-varying behavior of an Actor relative to the data field. Terrains are d×t×v (day of year (d), time of day (t), average weighted value (v)) structures that evolve over time to reflect, as noted, the behavior of an Actor for that data field. Data value learning and analysis in cognitive modeling is done primarily at the Terrain level.
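A Terrain of the d×t×v form described above may be sketched, purely as an example, as a grid of cells indexed by day of year and time of day. The hourly time slots, the class name, and the exponentially weighted update used to maintain the average weighted value are assumptions; the description does not specify the slot size or the weighting scheme.

```python
from datetime import datetime


class Terrain:
    """Grid of average weighted values for one Actor and one data field."""

    def __init__(self, days: int = 366, slots: int = 24, weight: float = 0.2):
        self.weight = weight  # influence of the newest observation (assumed)
        self.cells = [[None] * slots for _ in range(days)]

    def update(self, when: datetime, value: float) -> float:
        """Fold a new observation into the cell for (day of year, hour)."""
        d = when.timetuple().tm_yday - 1  # day of year, zero-based
        t = when.hour                     # time-of-day slot (hourly, assumed)
        current = self.cells[d][t]
        if current is None:
            updated = value               # first observation seeds the cell
        else:
            updated = (1 - self.weight) * current + self.weight * value
        self.cells[d][t] = updated
        return updated


terrain = Terrain(weight=0.5)
terrain.update(datetime(2017, 3, 6, 14, 5), 10.0)
v = terrain.update(datetime(2017, 3, 6, 14, 40), 20.0)
```

Each cell evolves as new events arrive, so the grid as a whole reflects the Actor's learned behavior for the field at each day and time.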

A collection of all the Terrains and graphs for an Actor is called the landscape. A graph is a stationary or time-oriented, connectionist architecture that discovers and connects the Actors to Actions, assets, and any other virtual or physical object or concept. Graphs can be Markov or homogeneous or heterogeneous or hyper-graphs. The Order of Operations Markov graphs, as an example, learn the behavior of an Actor for any pre-defined periodicity, e.g. a particular day of the week. The edge between any two Actions (say Ai and Aj) contains the probability that an Actor performs Action Ai and then performs Aj within some average interval ( ) of separating time, T. A collection of all the graphs for an Actor is called the topography network (or simply, the network) and is part of the Actor's landscape. A named collection of Actors (with their landscapes) is an ecology (or ecosystem) but is informally called a model.

With the foregoing definitions and explanations, the process flow of the design is generally as follows. First, the System provides or executes an Actor/Asset process flow. Such an Actor/Asset process flow may be the first two steps in the overall Process Flow.

A Data Dictionary is created for each Taxonomy. This includes the definition/mapping of the Actor, Action, Process, Field, Value, Source Asset, Destination Asset and Time fields from the data source to the Taxonomy. In addition, the System builds the list of valid Anomaly Analyses.

The System takes in Actors and their associated information. To take in or update Actor information in the System, the System employs a Resource Mapping. The user may create an appropriate Resource Mapping for the Actor data available. The System processes the Resource Mapping to take in this information. Resource Mappings may be run from time to time to reload and/or update Actor information.

Assets and their associated information are taken into the System. To take in or update Asset information in the System, the System uses a Resource Mapping. Again, the user may create an appropriate Resource Mapping for the Asset data available. The System processes the Resource Mapping to ingest this information. Resource Mappings may be run from time to time to reload and/or update Asset information.

Once the Data Dictionary entries are created, the System creates at least one Cognitive Model. The first step is to determine which Data Dictionary entries will be needed by the Cognitive Model.

Next, the System loads historical data from the data source for each Data Dictionary entry. A time period is selected for this historical data.

From the ingestion of historical data, the Cognitive Model is generated. The trust of the Cognitive Model is calculated, as a fuzzy number.

If the Cognitive Model trust is sufficiently high, the Cognitive Model is activated. Once activated, the Cognitive Model schedules a collection of tasks to run that perform regular extraction of actions (events) from the original data source, as well as Anomaly Analyses associated with the Cognitive Model. These tasks may also be run in an On-Demand (ad-hoc) fashion.

The following is the process of an activated Cognitive Model.

For each Data Dictionary entry selected in the Cognitive Model, the corresponding (or calculated) Actor Actions are extracted and then “normalized”. In addition to the extraction of an event, Actor Actions may be calculated based on information stored in the data source. An Actor Action is “normalized” by converting the original derived (or calculated) event to the specific Data Dictionary format (Taxonomy).

Each normalized Terrain entry is inserted into the Cognitive Model. A list is kept in each entry of the Data Dictionary as to which components of the Cognitive Model are to be updated upon Terrain ingestion. All internal structures of the Cognitive Model that need to be updated are then updated.

The process that keeps Models updated repeats at standard intervals.

Anomaly Analyses then run against Models in a scheduled fashion. These analyses may also be run in an On-Demand fashion. Each analysis runs independently of any other Anomaly Analysis.

Each analysis will extract the relevant information from a Cognitive Model for each Actor over a specified time period. The results of an analysis run are written to the data store in the form of a Cognitive Signal, stating, amongst other things, the Anomaly, the severity intensity as a fuzzy number, the Taxonomy information (see the Taxonomy Definition above), the associated event(s) and all other relevant knowledge necessary to validate the Actor Anomaly.
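One possible in-memory shape for such a Cognitive Signal is sketched below. The class name, field names, and field types are hypothetical; only the kinds of information carried (the Anomaly, a severity expressed as a fuzzy number, Taxonomy information, and associated events) are taken from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class CognitiveSignal:
    """Illustrative record written by an Anomaly Analysis (assumed layout)."""
    actor: str
    anomaly: str                  # e.g. the name of the analysis that fired
    severity: float               # severity intensity, a fuzzy number in [0, 1]
    taxonomy: dict                # Taxonomy information for the Action
    events: list = field(default_factory=list)  # associated event identifiers

    def __post_init__(self):
        if not 0.0 <= self.severity <= 1.0:
            raise ValueError("severity must be a fuzzy number in [0, 1]")


signal = CognitiveSignal(
    actor="jdoe",
    anomaly="Threshold",
    severity=0.82,
    taxonomy={"action": "send email", "field": "emails per hour"},
    events=["evt-001"],
)
```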

Independent of Anomaly Analyses, the System runs Hazard and Threat Analyses in a scheduled fashion. These analyses may also be run in an On-Demand fashion. These analyses ingest Cognitive Signals generated over a specific time period, and, in combination with Actor (and Asset) Criticality, determine whether any Actor Behavior is considered to be a Threat. These analyses will output this knowledge in the form of a Cognitive Signal, stating, amongst other things, the Actor, Assets, severity intensity as a fuzzy number, the associated Cognitive Signal(s), and all other relevant knowledge necessary to validate the Actor Hazard or Actor Threat.

A high level operational perspective of the present system is shown in FIG. 1, showing the way in which a cognitive modeling model is created and how this model is executed. FIG. 1 is a simplified version for teaching purposes. A cognitive modeling installation involves the creation of an aggregating model. A set of transient models is created on many servers. The computational servers return the transient model to the aggregating model which creates the current run-time model. This run-time model is then sent back to the compute server. The present description does not get into this level of detail but such a level of detail is contemplated and forms part of the design.

FIG. 1 illustrates cognitive modeling operation. The System creates a cognitive modeling model with the xvCreateModel command at point 101. This command uses event history 111 to discover the normal behavior of each Actor (that is, it populates all of the Terrains and builds the day-of-week Markov graphs for each Actor). The xvCreateModel command creates an empty cognitive modeling model. While the System can specify a source of history data with the same command, xvCreateModel name=name, alternately data=<search specs> can be used to specify where to find the repository of history data. Point 102 represents the Persistent Run-Time Model. The persistent model is the disk resident version of the System. Persistent Run-Time Model 102 contains the overall ecosystem. The ecosystem contains a set of Actors. For each Actor the System stores the Actor landscape. A landscape contains the Terrains and the Markov graphs.

During model generation, the system uses a set of application specific Common Data (Information) Models 113 to find the properties of Actors as well as the structure and data contents of events. A client specified configuration file 114 guides the ways in which the physical model is created and installed.

The Persistent Run-Time Model 102 also contains two files updated by the Persistent Run-Time Model 102. These are run-time properties 115. The run-time properties file system is maintained in several states. The first state is the version that corresponds to the current version of the model (which is the date it was last saved). The previous N versions of the model may also be saved. The second state is exploratory. An exploratory properties file is used in the genetic configuration and optimization processes. These functions create a large collection of possible run-time properties (each is a “genome” in a Cognitive Genetic Algorithm) and then seek out the set of properties that maximizes the performance (sensitivity) of the model.

The adaptive controls 116 contain all the parameters and properties used in the machine learning algorithms. These are used by the Cognitive Evolutionary Strategy mechanisms (Cognitive Genetic Algorithms) to optimize the System configuration and the machine learning functions. The adaptive controls form a genome in the genetic tuning system. The genetic algorithm then explores thousands of possible genomes with random variations to find the optimal solution to the learning system configurations.

This phase also creates a series of profile statistics for the model as well as for each Actor and its landscape. From these statistics, the client can decide whether to accept or re-generate the model (using more or additional data).

Run Model 103 is shown in FIG. 1. The cognitive modeling model starts with the xvRunModel command, which provides the name of the model to start and the request (search) commands that find and supply the events. The System can start or re-start a slave (remote level) model using the xvStartRemoteModel command. When the System starts a model running, it typically starts with two sets of asynchronous components. The first component set processes each field in each incoming event. This first component set checks the data against a set of thresholds and then compares the event against a graph of allowed Action to Action transitions. The second component set contains analyses that run when scheduled. These functions involve advanced (and often time consuming) analyses among many Actors or against the history of behavior of an Actor (compared against many other Actors).

Signal Generation element 104 is also shown. Cognitive Modeling communicates between its analytical components as well as with the outside world through a set of signals. Signals tie the internal machine learning functions together and also tie the cognitive modeling application to client interfaces and other applications.

Various machine learning and analytical functions underlie a cognitive modeling application. While this architecture is common to all the cognitive modeling applications, some applications may add additional functions and services to the base architecture.

FIG. 2 is a detailed schematic roadmap of core System functions. FIG. 2 and this description represent a rather high level perspective but address all the current principal components and show how they are connected to each other. Boxes in dashed lines indicate an internal, support function for which no explicit cognitive modeling command exists. Boxes 250 a-e represent the combined signal manager and the adaptive feedback system (the ability to exclude signals based on any number of content properties). Boxes with multiple human-type figures indicate a process that analyzes sets of Actors (instead of individual Actors). Arrows show the general operational flow.

The lock-and-key symbol means that the functions to the right are currently activated by a set of gate-keeping rules which determine when they are executed. That is, the System does not usually run a peer-to-peer analysis every time a new event is sent into the application.

The FIG. 2 design suggests or represents a Minimum Viable Product (MVP) version of cognitive modeling (as a line of products). The MVP system incorporates the core functionality used to detect anomalies (and separate threats from the population of anomalous but not dangerous states).

Note that many machine learning algorithms are very complex and involve subtle time-varying states and state transitions. This document describes, in more or less general terms, what a function contributes to the overall application capabilities.

Element 211 represents Acquire Events. Cognitive modeling begins with the acquisition of events. These events are collected through the xvAcquireEvents request (or through the xvRunModel command, which essentially passes its event matching request to the underlying xvAcquireEvents command). An event contains an associated date as well as a set of data fields. These fields are defined in an associated data model (see element 212).

Element 212 represents Sending each Data Element. Using the associated data model (the Common Information Model), the System decomposes an event into a set of its data elements (or fields). Each of the N data fields in an event is represented as a three-tuple: (time, field-name, value). The representation may be physical or logical, meaning that the representation of a tuple does not materially affect the functionality of the System as long as it is, at a minimum, a three-tuple containing the required members. These tuples are sent into the cognitive modeling environment one at a time. When the complete set of data fields has been sent, the System progresses to the next physical or logical event.
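As a minimal sketch of the decomposition just described, the following Python fragment turns one event into (time, field-name, value) three-tuples that are emitted one at a time. The function name `decompose_event`, the `time` key, and the sample event are hypothetical and only illustrate the idea; the actual field list would come from the Common Information Model.

```python
from typing import Any, Dict, Iterator, Tuple

FieldTuple = Tuple[int, str, Any]  # (time, field-name, value)


def decompose_event(event: Dict[str, Any], time_key: str = "time") -> Iterator[FieldTuple]:
    """Yield one three-tuple per data field of the event (hypothetical helper)."""
    timestamp = event[time_key]
    for name, value in event.items():
        if name != time_key:  # the event time is carried inside every tuple instead
            yield (timestamp, name, value)


# Illustrative event; field names are assumptions, not taken from a real data model.
event = {"time": 300000, "emailSize": 1000, "recipientCount": 2}
tuples = list(decompose_event(event))
```

Each tuple can then be handed to the per-field components independently, which is what allows the same set of components to be repeated for every field.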

Element 250 e is a signal filter which interfaces with signal weights repository 214 and provides filtered signal information to display 213 in this arrangement. Signal filter 250 e interfaces with signal managers 250 c and 250 d.

Element 201 is the Apply Thresholds element. Thresholds detect anomalous data values in events or cumulative anomalous data values on a Terrain. As a data element approaches the threshold violation value, the threshold analysis component begins to emit signals (violation proximity and eventually violation triggered conditions). Very large data values (values that fall outside the domain (or allowable range) of the Terrain) trigger an outlier signal.

Thresholds in cognitive modeling are modeled as fuzzy sets. The use of fuzzy logic allows a measured degree of such conditions as nearness to a threshold, hard and soft violations, and the degree of a violation. Thresholds are dynamically created from the statistical properties of the data using complex data aggregation and discrimination techniques. There are several different kinds of thresholds. Each Terrain (data field) has its own set of thresholds. A Terrain can simultaneously have multiple thresholds across different parts of its surface (that is, as one example, for different time periods).

A rate and degree of change threshold (RDCT) is a trigger. By default, the trigger sits well above (and, in some cases, also below) the distribution mean, at some multiple of the standard deviation of the data. But a trigger threshold is not a single value. A threshold is a fuzzy sigmoid space, so that the degree of threshold violation increases rather quickly as a data point approaches the actual threshold.

Outliers are detected by the outlier boundary threshold (OBT), a narrow semi-permeable fuzzy tensor that sits well above (and possibly below) the violation point of a rate of change threshold. An outlier is an event data element or the cumulative value of a Terrain cell that is either four or more standard deviations from the distribution mean or an order of magnitude larger (as measured by the log² of the value) than the allowed range of values (that is, it falls outside the domain of the Terrain).

Event data elements that exceed the OBT are not used to update the Terrain. Cumulative data values on the Terrain that exceed the OBT cause the model to either cease updating the Terrain at that day and time region or to put the model into discovery and training mode for that Terrain.

Data elements (either from events or from cumulative Terrain values) that fall outside the outlier boundary threshold are sent to the Outlier Analysis element 202. The Outlier Analysis element 202 determines whether or not the frequency, periodicity, and general value of outliers imply the start of either another data pattern or a change to an existing data pattern. As an example, if an Actor is a project manager who takes on an additional large project, then we might expect a considerable increase in email frequency and size as well as in the number of to and from addresses. It is possible that these values exceed the current computed range of the current Terrains. If a pattern is detected, the System uses an adaptive reinforcement learning technique to gradually increase the rate and degree of change threshold as well as the outlier boundary threshold.

As the System continually updates a Terrain, it is important to determine whether or not the time-varying changes in the local Terrain region describe a consistent growth (or decay) in the average value in such a way that, if the trend continues, it will lead to a violation of the rate and degree of change threshold. Predict Threshold Violations element 203 uses a third-order polynomial (nonlinear) regression to make the prediction. This third-order polynomial works well on linear data as well as on most non-erratic nonlinear data.
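The following Python sketch illustrates the kind of prediction described above: it fits a third-order polynomial to recent average values by ordinary least squares and reports how many steps ahead the fitted curve first reaches a threshold. It is an illustration under stated assumptions, not the System's implementation: the function names, the `horizon` parameter, the use of a crisp threshold value, and the evenly spaced sample times are all hypothetical.

```python
from typing import List, Optional, Sequence


def fit_cubic(xs: Sequence[float], ys: Sequence[float]) -> List[float]:
    """Least-squares fit of y = c0 + c1*x + c2*x^2 + c3*x^3 via the normal equations."""
    if len(xs) < 4:
        raise ValueError("a cubic fit needs at least four points")
    n = 4
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def predict(coeffs: Sequence[float], x: float) -> float:
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def steps_until_violation(ys: Sequence[float], threshold: float, horizon: int = 10) -> Optional[int]:
    """Return the first future step at which the trend reaches the threshold, else None."""
    xs = list(range(len(ys)))
    coeffs = fit_cubic(xs, ys)
    for step in range(1, horizon + 1):
        if predict(coeffs, xs[-1] + step) >= threshold:
            return step
    return None


# A steadily growing cell average (illustrative data) against a threshold of 25.
growing = steps_until_violation([10, 12, 14, 16, 18, 20], 25.0)
flat = steps_until_violation([5.0] * 6, 25.0)
```

Because a cubic contains the linear case, the same fit handles the steadily growing series above without a separate linear model, which matches the remark that the polynomial works on both linear and non-erratic nonlinear data.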

The System employs Evaluate Time-Ordered Behavior element 204, using a mechanism known as a Markov Graph, to learn the general sequence of events performed by an Actor during each day of the week. On day of the week D_w, the graph connects Taxonomic Action A_i with Taxonomic Action A_j by observing that A_j follows (or precedes) A_i with some probability P(k).

Thus when the System receives an event and its data fields, it stores Actions (A_i ... A_k) in a LIFO queue. When the depth of the Action queue is greater than one, the System can check whether each Action pair on the queue (A_1, A_2, ..., A_k) lies along an edge of the Markov graph for the current day of week. The associated probability of A_i→A_j indicates the degree to which the sequence is normal for this Actor on this day of the week. Probabilities in this context are treated as fuzzy numbers (that is, they have a bell curve around their value and a degree of expectancy). When the association is not normal (to some degree), the System may emit an anomalous sequence signal.

Actor sequence processing may be as follows. Actor sequence processing and analysis looks at an action performed by an Actor (e.g. login to VPN) and determines whether the action is something the Actor normally does relative to a previous activity. If the action is out of the ordinary, the System generates an anomaly signal.

Sequence analysis occurs after the historical graphs for each Actor are built, so that the Actor's historical behavioral patterns are understood. Processing operates as follows:

for each action performed since the last time the Actor's analysis ran:

    retrieve the prior activity performed

    if the prior activity was performed a day or more earlier than the current activity:
        the current activity is considered to be the first event of the day, so a special Node called ‘firstEventOfDay’ is considered to be the prior Node
    else:
        the prior activity is unmodified

    retrieve the Edge connecting the prior activity with the current activity

    if the Edge does not exist:
        this is the first time the new activity has been performed; generate an unusual activity signal and process the next activity

    retrieve the total frequency count of all Edges that transition away from the prior activity

    derive the strength of the Edge connecting the prior activity with the current activity: strength = (currentActivityFrequencyCount / priorActivityFrequencyCount), which is a probability value between 0 and 1

    add the current transition probability to a cumulative transition probability, which tracks the average strength of all transitions in the current day

    if the derived probability is near 0 (for example, between 0 and 0.02, a very rare transition):
        generate a rare Actor sequence signal
    else if the derived probability is less than or equal to the minimum allowed Edge strength (0.05):
        generate an uncommon Actor sequence anomaly signal (rare transition)
    else:
        the transition is valid

if the cumulative Edge strength is less than the minimum allowed cumulative Edge strength:
    emit a low average Edge strength anomaly indicating that the strength of the Actor's transitions for the day is lower than expected

persist changes made to the Graph to allow subsequent runs to pick up where this analysis left off
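The sequence processing above can be sketched in Python as follows. This is an illustrative reading of the steps, not the System's code: the class and function names, the signal labels, the sample graph, and the `min_cumulative` default of 0.10 are hypothetical (the text does not give a value for the minimum cumulative Edge strength). The 0.02 and 0.05 cut-offs are the example values from the text.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

RARE_MAX = 0.02            # "near 0" band from the text
MIN_EDGE_STRENGTH = 0.05   # minimum allowed Edge strength from the text
FIRST = "firstEventOfDay"  # special prior Node for the first action of a day


class MarkovGraph:
    """Day-of-week transition graph holding Edge frequency counts."""

    def __init__(self) -> None:
        self.edges: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))

    def observe(self, prior: str, current: str, count: int = 1) -> None:
        self.edges[prior][current] += count

    def strength(self, prior: str, current: str) -> Optional[float]:
        """Edge count divided by all counts leaving `prior`; None if no Edge exists."""
        outgoing = self.edges.get(prior, {})
        total = sum(outgoing.values())
        count = outgoing.get(current, 0)
        return count / total if total and count else None


def analyze_day(graph: MarkovGraph, actions: List[str],
                min_cumulative: float = 0.10) -> List[Tuple[str, str]]:
    """Return (signal, transition) pairs for one day's time-ordered actions."""
    signals: List[Tuple[str, str]] = []
    strengths: List[float] = []
    prior = FIRST
    for current in actions:
        label = f"{prior}->{current}"
        p = graph.strength(prior, current)
        if p is None:
            signals.append(("unusual activity", label))
        else:
            strengths.append(p)
            if p <= RARE_MAX:
                signals.append(("rare sequence", label))
            elif p <= MIN_EDGE_STRENGTH:
                signals.append(("uncommon sequence", label))
        graph.observe(prior, current)  # keep the graph current for subsequent runs
        prior = current
    if strengths and sum(strengths) / len(strengths) < min_cumulative:
        signals.append(("low average edge strength", "day"))
    return signals


def sample_graph() -> MarkovGraph:
    """Illustrative history: login is usually followed by sending email."""
    g = MarkovGraph()
    g.observe(FIRST, "login", 50)
    g.observe("login", "sendEmail", 95)
    g.observe("login", "vpn", 4)
    g.observe("login", "ftp", 1)
    return g


rare = analyze_day(sample_graph(), ["login", "ftp"])
uncommon = analyze_day(sample_graph(), ["login", "vpn"])
unseen = analyze_day(sample_graph(), ["login", "sudo"])
```

In this sketch the probabilities are crisp numbers; the text treats them as fuzzy numbers, so a fuller version would compare memberships rather than raw ratios.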

The Markov graph cannot indicate or provide fundamental prerequisite Actions. That is, the System is completely data driven. The System cannot learn that one must authenticate oneself to a network before one uses that network (to send emails, as an example). That authentication regularly precedes other Actions can, however, be learned (thus there is a graph edge authenticate→sendEmail). Hence if one sends an email before authenticating, the graph analysis would detect an anomaly.

Peer to peer analysis element 205, also known as Actor behavior analysis element, detects anomalous behaviors among Actors who are work-specific peers; that is, they are assumed to share the same kinds of tasks. During peer analysis the System computes the similarity between shared Terrains among the peers. The degree of dissimilarity (based on fuzzy similarity functions) between Actors reflects the degree of anomalous behavior within some subset of Actors. The stronger the peer grouping, the more reliable the analysis. Strength is measured by two factors: how much the Actors are actually peers and how much history exists (the density (trustworthiness) of the Terrains). As part of this analysis the System computes two particular metrics: trust and cohesion index.

Peer to Peer Analysis generally operates as follows. Peer to Peer Analysis is designed to detect anomalous behavior amongst a group of Actors considered to be peers. The Actor attributes that define the peer group are specified by the caller and can be any of the attributes defined on the actor object (which is extendable). Examples of attributes that can be used to group peers include but are not limited to:

actorType (person, machine, device, etc)

gender (person only)

businessUnit

title

managedBy

category

geoLocation bounding box

region

class

tag

The peer to peer processing works as follows:

1. Identify Actors that are part of the peer group as specified on the command line. This is currently limited to an Actor's actorType, business unit, category, managedBy, etc. Actors in cache are first identified as matching the criteria, then the System examines the model and every Actor with a corresponding landscape on disk is placed into an ecosystem. If models for a selected Actor do not exist, such Actors are excluded from analysis (since there is no data to compare).

2. One by one, the System compares each Actor's Terrains to those of all other Actors in the peer group and caches the results. Since the System is comparing all peers, the comparison for each Actor combination only happens once. For example, for three actors (harry, earl, and joe), only three comparisons occur:

harry->earl, harry->joe, then earl->joe.
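The once-per-pair comparison order can be illustrated with a short Python sketch. The `compare_all` helper and its `similarity` callback are hypothetical placeholders for the Terrain comparison; only the pairing logic is the point here. For n peers there are n(n-1)/2 comparisons.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

peers: List[str] = ["harry", "earl", "joe"]

# Each unordered pair appears exactly once, in the order given in the text.
pairs = list(combinations(peers, 2))


def compare_all(actors: List[str],
                similarity: Callable[[str, str], float]) -> Dict[Tuple[str, str], float]:
    """Compare every pair once and cache the result (hypothetical helper)."""
    return {(a, b): similarity(a, b) for a, b in combinations(actors, 2)}


# Placeholder similarity function used only to show the cache shape.
cache = compare_all(peers, lambda a, b: 1.0)
```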

For each Terrain compared, the System aggregates the Terrain's data into a compressed terrain matrix that is cached for later use. The shape of the matrix and the data contained are controlled by the properties below in bold. The other properties included are used to tweak the behavior of the Peer comparison algorithm.

xv.peeranalysis.lookbackWindowDays=30

xv.peeranalysis.daysInterval=3

xv.peer.analysis.periodInterval=16

The properties above indicate that 30 days of data may be aggregated into a 10×6 matrix. Each cell contains three days of data in a four hour time window, where x is days and y is time period.
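The arithmetic behind the 10×6 shape can be checked with a small Python sketch. It assumes that the period interval counts 15 minute Terrain cells, so that 16 cells make up a four hour period; that reading is an assumption made here because it reproduces the stated four hour window, and the function name is hypothetical.

```python
from typing import Tuple

CELL_MINUTES = 15  # assumed size of one Terrain cell


def matrix_shape(lookback_days: int, days_interval: int, period_interval: int,
                 cell_minutes: int = CELL_MINUTES) -> Tuple[int, int, float]:
    """Return (columns of days, rows of day periods, hours per period)."""
    columns = lookback_days // days_interval
    hours_per_period = period_interval * cell_minutes / 60
    rows = int(24 / hours_per_period)
    return columns, rows, hours_per_period


# lookbackWindowDays=30, daysInterval=3, periodInterval=16
shape = matrix_shape(30, 3, 16)
```

With the values listed above this yields ten columns of three days each and six rows of four hours each.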

3. After all comparisons are complete, the System performs anomaly detection. The System emits signals during the anomaly detection phase. These properties are used to control the Peer Comparison anomaly detection processing:

xv.peer.analysis.faultyPeergroupAlphacut=0.80

xv.peer.analysis.anomalousActorAlphacut=0.85

xv.peeranalysis.countZeroSimilarityCells=false

xv.peer.analysis.genereate.virtual.terrains=true

# Empty non virtual terrain diagnostic will be added if terrain event count is less than this value.

xv.peer.analysis.empty.terrain.count.threshold=10

# Low terrain similarity diagnostic will be added if terrain similarity is less than this value.

xv.peer.analysis.low.terrain.similarity.threshold=0.70

# Percent threshold that determines who receives outlier terrain diagnostic for non-core terrains.
# If the number of actors with the terrain is above this percent, actors that don't have the
# terrain will receive the diagnostic. If the number of actors with the terrain is below this percent,
# the actors that have the terrain will receive the diagnostic. Goal is to send the diagnostic to
# the minority of users.

xv.peer.analysis.outlier.terrain.actor.count.percent=0.50

Anomaly detection/signal generation processing operates as follows:

1. Output Landscape similarity signals:

-   -   a. Output raw compressed terrain data signal for each actor terrain that was included in the p2p comparison: actorId, terrainId, matrix of data counts (cell by cell)
-   -   b. Output actor terrain similarity results: actor1Id, actor2Id, terrainId, matrix of similarity results (cell by cell)
-   -   c. Output landscape similarity results: actor1Id, actor2Id, landscape similarity value.

2. Actors are then ranked against their peers and an actor ranking signal is emitted; anomalous actors are also identified based on ranking. Actor ranking works as follows:

-   -   a. Calculate the peer group mean similarity, which is the average actor landscape similarity value across all peer comparisons that were performed.
-   -   b. Determine if the peer group is ‘faulty’, which means the average similarity across all peers is less than the faulty peer group alpha cut (0.80).
-   -   c. For each Actor, derive the Actor's average similarity to other peers, then divide by the peer group mean similarity to derive the Actor ranking. If the Actor ranking is below the alpha cut (0.85), then the Actor is considered anomalous due to low similarity with the peer group. The System may store and emit a diagnostic message as part of an anomalous Actor signal at a later time.
-   -   d. Emit an actor ranking signal listing the ranking of each actor as calculated above.

3. Generate P2P Similarity Matrix signal, which lists actors on the X and Y axes and the similarity result for each comparison. Results provide this data:

            atrisler  blinebaugh  capela   cindy    cyrus    pepper   pwrchute  rfrie    sasha    sstansbury
atrisler    1.00000   0.90834     0.90145  0.78997  0.75336  0.79079  0.78929   0.79361  0.62779  0.76421
blinebaugh  0.90834   1.00000     0.90297  0.80029  0.76248  0.79781  0.80016   0.79398  0.63443  0.76102
capela      0.90145   0.90297     1.00000  0.78336  0.75750  0.79612  0.77541   0.79863  0.62909  0.77536
cindy       0.78997   0.80029     0.78336  1.00000  0.76111  0.79923  0.79993   0.78824  0.62685  0.76080
cyrus       0.75336   0.76248     0.75750  0.76111  1.00000  0.75817  0.73052   0.75831  0.59904  0.75207
pepper      0.79079   0.79781     0.79612  0.79923  0.75817  1.00000  0.80094   0.80330  0.62876  0.77367
pwrchute    0.78929   0.80016     0.77541  0.79993  0.73052  0.80094  1.00000   0.78374  0.62041  0.77128
rfrie       0.79361   0.79398     0.79863  0.78824  0.75831  0.80330  0.78374   1.00000  0.62102  0.76405
sasha       0.62779   0.63443     0.62909  0.62685  0.59904  0.62876  0.62041   0.62102  1.00000  0.60370
sstansbury  0.76421   0.76102     0.77536  0.76080  0.75207  0.77367  0.77128   0.76405

4. Outlier Terrains are identified and signals emitted for each. The System then identifies anomalous Actors based on outlier Terrain ownership. An outlier Terrain is one owned by a minority of Actors in the peer group, e.g., 50% or less of the Actors (this value is configurable). Actors that have the outlier Terrain are flagged as anomalous.

5. For each anomalous Actor identified above, rank Terrains and emit Terrain ranking signals. Such ranking helps identify those Terrains in Actors with anomalous behavior that could be of interest. Terrain ranking is calculated by obtaining the mean similarity for each Terrain compared and then dividing the Actor's mean similarity for the Terrain by the peer group's mean similarity for that Terrain. If the ratio of the Actor's mean similarity to the peer group mean is less than the alpha cut (e.g. 0.85), then the Actor's Terrain is considered anomalous.

6. Emit anomalous Actor signals with details as to which Actor was identified as anomalous.

7. Emit P2P Analysis profile signal, which provides a summary of the analysis including criteria used to identify the peer group, trust index, and overall results.
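The actor ranking of step 2 can be sketched in Python using the pairwise similarity values for four of the actors in the P2P similarity matrix of step 3. The function name and data layout are hypothetical; the alpha cuts are the configured example values (0.80 and 0.85). This sketch works on crisp averages only and omits signal emission.

```python
from statistics import mean
from typing import Dict, List, Tuple

FAULTY_PEER_GROUP_ALPHA_CUT = 0.80
ANOMALOUS_ACTOR_ALPHA_CUT = 0.85

# One entry per comparison performed (each pair compared once).
similarity: Dict[Tuple[str, str], float] = {
    ("atrisler", "blinebaugh"): 0.90834,
    ("atrisler", "capela"): 0.90145,
    ("atrisler", "sasha"): 0.62779,
    ("blinebaugh", "capela"): 0.90297,
    ("blinebaugh", "sasha"): 0.63443,
    ("capela", "sasha"): 0.62909,
}


def rank_actors(sims: Dict[Tuple[str, str], float]) -> Tuple[float, Dict[str, float]]:
    """Return the peer group mean similarity and each actor's ranking."""
    group_mean = mean(sims.values())
    actors = sorted({actor for pair in sims for actor in pair})
    ranking = {}
    for actor in actors:
        own = [value for pair, value in sims.items() if actor in pair]
        ranking[actor] = mean(own) / group_mean  # average similarity relative to the group
    return group_mean, ranking


group_mean, ranking = rank_actors(similarity)
faulty_group = group_mean < FAULTY_PEER_GROUP_ALPHA_CUT
anomalous: List[str] = [a for a, r in ranking.items() if r < ANOMALOUS_ACTOR_ALPHA_CUT]
```

In this subset the low similarities involving sasha pull the group mean below 0.80, so the group is flagged as faulty, and sasha is the only actor whose ranking falls below 0.85.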

In peer analysis, it can be beneficial to cluster together a set of Actors that generally do the same kinds of work. The more specific the clustering, the more reliable (more trustworthy) the peer analysis. But the more robust the clustering, the more revealing the peer analysis. That is, if the cluster is so specific that it only includes a few Actors, it might be reliable but may not reveal larger Actor trends in the Environment. If the clustering is less specific so that it includes a large number of Actors, it might reveal what appear to be larger Actor trends, but these trends might not be reliable. Hence clustering Actors on their business unit (sales, IT, marketing, customer support, engineering, and so forth) yields a meaningless grouping because there are so many different jobs. Clustering on business unit and age would also be meaningless because the business unit is general and age does not qualify the business unit into jobs that are similar. The present System, as discussed below, employs Actor Correlation Analysis as a method of selecting peer groupings based on the inherent behavior correlation between Actors sharing the same external properties (age, job title, authorities, demographics, geographics, work periods, and so forth).

Actor Correlation Analysis applies Pearson's r correlation to all the Terrain similarities over L time periods over all the Actors. The System then organizes the A×T (Actors×Terrain similarities) by clustering on r² (equivalently, on the absolute value of r) to identify the centers of gravity of Actor groups having the highest degree of similarity (regardless of their underlying peer properties). When simply attempting to discover correlations, whether the correlation is positive or negative does not matter in the initial analysis. Thus r=+0.8 and r=−0.8 both indicate a relatively high degree of correlation. The direction of correlation is a matter of interpretation. The nature of fuzzy clustering facilitates developing a set of outward frontiers for each cluster to conceptually match Actors based on the population of shared Terrains at a specific contextual (conceptual) level of similarity.
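As a small illustration of why the sign of the correlation can be ignored in the initial analysis, the following Python sketch computes Pearson's r for two pairs of series and compares their magnitudes. The function name and sample series are hypothetical; the formula is the standard sample correlation, and it raises ZeroDivisionError when either series is constant.

```python
from math import sqrt
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Standard Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative Terrain similarity series over four time periods.
actor_a = [0.2, 0.4, 0.6, 0.8]
actor_b = [0.1, 0.2, 0.3, 0.4]   # moves with actor_a
actor_c = [0.8, 0.6, 0.4, 0.2]   # moves against actor_a

r_ab = pearson_r(actor_a, actor_b)
r_ac = pearson_r(actor_a, actor_c)
same_strength = abs(abs(r_ab) - abs(r_ac)) < 1e-9
```

Both pairs are equally strongly related, one positively and one negatively, so clustering on r² or |r| places them at the same distance.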

Using the Actor Behavior Analysis component 206, the System examines the change in an Actor's behavior over time by comparing the similarity of past behavior (the history) and current behavior. There are two flavors of behavior change analysis: aggregate Terrain segmentation and repeating day of the week. In aggregate Terrain analysis the System decides how much history to use (this is L, the run length of the analysis in terms of M). The System then partitions a Terrain into M×N rectangular segments (where M equals the number of prior days of the year (an even division of L) that constitute a segment, and N equals the range of the time of day). The System then computes the over-all similarity of an Actor's behavior for each pair of segments. The System then computes the degree and rates of change in behavior over all the segments, indicating whether an Actor's behavior is subtly (or obviously) changing over time.

In the repeating day of week analysis, the System also determines how far back in history to look (this is L, the run length of the analysis in weeks). Instead of aggregating a Terrain into clusters of M×N time periods, the System uses D×N, where D is the day of the week (Sunday=0, Monday=1, etc.). The System then moves back through history for L weeks, comparing the similarity of D at L with D at L−1, D at L−1 with D at L−2, D at L−2 with D at L−3, and so forth. The signed sum of the differences in similarity provides a rate of over-all change in behavior.

Actor week to week analysis operates as follows. A Terrain event represents a single action being tracked in an Actor's terrain. The following is an example illustrating a sample event split into multiple Terrain events or actions. In this case the incoming event is a record of an email sent by an actor:

-   -   {user: john@doe.com, time: 300000, recipientCount: 2,        recipients: “jane@doe.com, joe@doe.com”, emailSize: 1000,        emailAttachmentSize: 20000, src: “1.2.3.4”, dest: “4.5.6.7”}

In this case the incoming event can result in multiple Terrain events or actions as directed by the Data Dictionary. The System tracks recipientCount, emailSize, and emailAttachmentSize. In this example the System has three terrainEvents or actions:

-   -   terrainEvent1: actorId: john@doe.com, time: 300000, src: “1.2.3.4”, dest: “4.5.6.7”, field: “emailSize”, value: 1000, key: 1234

-   -   terrainEvent2: actorId: john@doe.com, time: 300000, src: “1.2.3.4”, dest: “4.5.6.7”, field: “emailAttachmentSize”, value: 20000, key: 1234

-   -   terrainEvent3: actorId: john@doe.com, time: 300000, src: “1.2.3.4”, dest: “4.5.6.7”, field: “recipientCount”, value: 2, key: 1234

The System stores each of these events in its own Terrain. The System also adds each event's value to the 15 minute cell identified by the event's time.

Terrains maintain a matrix of compressed values for each event received during a specified (e.g. 15 minute) time period. For example, if the System is tracking email size and a user sends ten emails (1000 per email) between 9:00 and 9:15, the email size Terrain's cell for time period 9:00-9:15 will have a frequency count of ten, a value of 10000, and an average value of 1000 per event received.
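The cell bookkeeping in this example can be sketched in Python as follows. The `Terrain` and `Cell` classes are hypothetical, and event times are assumed here to be seconds since midnight purely to keep the example short; the text does not specify the time unit.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict

CELL_SECONDS = 15 * 60  # one Terrain cell covers 15 minutes


@dataclass
class Cell:
    count: int = 0      # frequency count of events in the cell
    total: float = 0.0  # cumulative value of those events

    @property
    def average(self) -> float:
        return self.total / self.count if self.count else 0.0


class Terrain:
    """Accumulates event values into 15 minute cells (illustrative only)."""

    def __init__(self) -> None:
        self.cells: DefaultDict[int, Cell] = defaultdict(Cell)

    def add(self, time_seconds: int, value: float) -> None:
        cell = self.cells[time_seconds // CELL_SECONDS]
        cell.count += 1
        cell.total += value


# Ten emails of size 1000 sent between 9:00 and 9:15.
email_size = Terrain()
for minute in range(10):
    email_size.add(9 * 3600 + minute * 60, 1000)

nine_am = email_size.cells[(9 * 3600) // CELL_SECONDS]
```

The cell for 9:00-9:15 ends with a frequency count of ten, a value of 10000, and an average of 1000, matching the figures in the text.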

A Terrain matrix is a Concept used by the System to aggregate Terrain cell values into different groupings of cells that can be compared with other Terrain matrices. A matrix has two dimensions: x (days) and y (hour of the day, 0-23). The number of days and hours of day occupied by each cell depends on various factors and may be established by a user or by the system.

FIG. 5 is an example of a compressed matrix having ten days of data (x axis) and eight periods of the day (four hours per period). Like the 15 minute Terrain cells, each cell in the matrix of FIG. 5 maintains a frequency count and value for all events received during the time frame (day(s) and hours of the day). The data represented is the summary of all data in the aggregated time period for a single Terrain, which can be compared against another Terrain cell (belonging to the same or a different actor) by the various analyses performed by the System.

Actor week to week analysis looks at an Actor's behavior Terrain by Terrain in week chunks to determine if the Actor's behavior is changing over time week to week. If there is enough of a change in the Actor's behavior over time, the System emits anomaly signals. For example, such analysis recognizes when an Actor goes on vacation for a week due to the lack of activity where there is normally activity. The System does the following to accomplish this goal:

for each terrain in the Actor's landscape:

-   -   generate a week aligned terrain matrix aligned from Monday-Sunday for a configurable number of weeks in the past (eight for example)
-   -   for each of the aggregated terrains (week1-week8) compare each cell from weekX with weekY (week1-week2, week2-week3, etc.) and derive a similarity CIX for all cells compared. Then generate an overall similarity CIX for the week to week comparison (e.g. week1-week2).
-   -   when all weeks have a similarity CIX value for the week to week comparison, derive percent change. If percent change is above or below an acceptable threshold (e.g. +30% or −30%) then consider the Terrain anomalous and generate an anomaly signal.

For each Terrain compared, the System derives an overall similarity CIX by averaging the similarity CIX of every week to week comparison. Then an overall Actor similarity CIX is derived by averaging the overall similarity CIX of each Terrain. If the Actor's overall average CIX is below an acceptable value, then the Actor is considered anomalous and an anomaly signal is generated indicating the Actor's behavior is significantly different than what has been learned historically. This may occur if many Terrains are dissimilar over the period of time analyzed.
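The week to week comparison can be sketched in Python as follows. The text does not define how the similarity CIX of two cells is computed, so this sketch substitutes a simple min/max ratio as a stand-in; that stand-in, the function names, and the sample weeks are assumptions made for illustration. The 30% change limit is the example value from the text.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def cell_similarity(a: float, b: float) -> float:
    """Stand-in for the similarity CIX of two non-negative cell values (assumption)."""
    if a == b:
        return 1.0
    return min(a, b) / max(a, b)


def week_similarity(week_a: Sequence[float], week_b: Sequence[float]) -> float:
    """Average cell similarity for one week to week comparison."""
    return mean(cell_similarity(a, b) for a, b in zip(week_a, week_b))


def analyze_terrain(weeks: List[Sequence[float]], max_change: float = 0.30) -> Tuple[float, bool]:
    """Return (overall terrain similarity, anomalous flag) for consecutive weeks."""
    sims = [week_similarity(a, b) for a, b in zip(weeks, weeks[1:])]
    changes = [(cur - prev) / prev for prev, cur in zip(sims, sims[1:]) if prev]
    anomalous = any(abs(change) > max_change for change in changes)
    return mean(sims), anomalous


# Three ordinary weeks followed by a week with no activity (e.g. a vacation).
steady = [[10, 20, 30]] * 4
vacation = [[10, 20, 30], [10, 20, 30], [10, 20, 30], [0, 0, 0]]

steady_result = analyze_terrain(steady)
vacation_result = analyze_terrain(vacation)
```

The per-Terrain overall values returned here would then be averaged across Terrains to obtain the Actor's overall similarity.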

FIG. 6 visually depicts this week by week comparison.

The Actor day of week analysis is similar to the Actor week to week analysis except that an Actor's behavior for a given day of the week is compared with the same day of the week over the N prior weeks to determine whether the Actor behaved on the day being tested as on that day of the week in the past. In this case, for the day of week being compared (e.g. Monday), the System creates two compressed matrices for each terrain. The first contains historical data in the form of a terrain matrix and the second contains the day of week being compared. The logic/processing for Actor day of week comparison is as follows:

for each terrain:

-   -   generate a compressed matrix of activity N weeks in the past (starting one week prior to the week being compared)
-   -   generate a compressed matrix of activity for the day being tested (e.g. last Monday)
-   -   generate a similarity CIX matrix showing the similarity of cells from the historical matrix versus the day being compared and derive an overall similarity by averaging all the similarity CIX values for the terrain.
-   -   if the overall terrain similarity CIX is below an acceptable value then consider the terrain anomalous and generate an anomaly signal
-   -   generate an overall similarity CIX value by averaging the terrain similarity CIX values generated. If the overall CIX value is below an acceptable value, consider the actor anomalous and generate an anomaly signal.

Semantic Rule Analysis element 207 is a rules engine underlying the principal analysis framework that encodes conditional, provisional, cognitive, operational, and functional knowledge. Rules are stored in a named rule package. Inside the rule package, rules are organized in rule sets according to their execution priorities and other conditions.

The rules generally have the form:

when <premise> then <Action>

where premise is any set of fuzzy, crisp (boolean), or hybrid propositions possibly connected by an “and” or “or” element. The Action sets the value of an outcome variable or the degree of compatibility (truth) in the global measure of knowledge across all the executed rules.
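A rule of this form can be illustrated with a short Python sketch. It assumes the common min/max interpretation of fuzzy “and” and “or”, which the text does not specify, and it combines repeated outcomes by taking the maximum truth. All names, facts, and truth values below are hypothetical.

```python
from typing import Callable, Dict

Facts = Dict[str, float]             # proposition name -> degree of truth in [0, 1]
Premise = Callable[[Facts], float]


def fact(name: str) -> Premise:
    """A single proposition; crisp propositions simply use 0.0 or 1.0."""
    return lambda facts: facts.get(name, 0.0)


def all_of(*parts: Premise) -> Premise:
    """Fuzzy 'and' taken as the minimum (assumed operator)."""
    return lambda facts: min(part(facts) for part in parts)


def any_of(*parts: Premise) -> Premise:
    """Fuzzy 'or' taken as the maximum (assumed operator)."""
    return lambda facts: max(part(facts) for part in parts)


def fire(premise: Premise, outcome: str, facts: Facts, outcomes: Facts) -> float:
    """when <premise> then <Action>: record the premise truth on the outcome variable."""
    truth = premise(facts)
    outcomes[outcome] = max(outcomes.get(outcome, 0.0), truth)
    return truth


facts = {"unusualLoginTime": 0.8, "largeDownload": 0.6, "newDevice": 0.1}
outcomes: Facts = {}

# when (unusualLoginTime and largeDownload) or newDevice then risk
premise = any_of(all_of(fact("unusualLoginTime"), fact("largeDownload")), fact("newDevice"))
truth = fire(premise, "risk", facts, outcomes)
```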

Predict Rate of Change element 208 is shown. Although the techniques are different, the System benefits from studying the change in behavior over time for peer to peer, Actor behavior change, and Actor correlation analysis. By storing similarity and correlation results (including fuzzy clustering centroids) over time, the System can apply auto-regression and polynomial regression analyses to various kinds of behavior history in order to compute, with standard error of estimates (SEE), future behavior, the rates of change, and the rates of the rate of change (how quickly or slowly change is taking place). Such discovery of rates of change and the acceleration (or deceleration) of these rates is useful in detecting and quantifying various layers of anomalous behavior.

Threat Detector 209 provides for the discovery and quantification of a threat (along with its severity, cohesion, and root cause), providing a principal outcome of the cognitive modeling application. The core idea is to learn how to separate a collection of anomalous behaviors from the (hopefully) smaller collection of anomalous behaviors that actually constitute (or will constitute) a threat. In order to do this the System considers the forest of signals emitted during the underlying behavior analyses (elements 201 through 208) as the supporting evidence for threat assessment. That is, the System applies data mining (against the signals) and machine learning techniques (primarily clustering and discrimination analysis) to isolate and clarify real threats. Threats are then classified (via a Bayesian network) into one of five severity levels (conceptually in a spectrum similar to the military DEFCON approach). Threats are also grouped by their type. Any number of threats, such as nine types of threats, may be classified.

The process of threat/threshold analysis compares action values and the average of action values grouped by time (into cells) against calculated threshold values. The System calculates the threshold value using the actor's historical data to derive a baseline of ‘normal behavior’. Threshold or threat analysis can be divided into the following:

a. Threshold: a threshold exists for each actor's terrains (actor, action, process, field) for each day of the week and time of day. For each terrain there are seven day-of-week thresholds with six sets of values (fuzzy numbers), each pertaining to a four-hour period of the day (i.e. 00:00:00-03:59:59, 04:00:00-07:59:59, 08:00:00-11:59:59, 12:00:00-15:59:59, 16:00:00-19:59:59, and 20:00:00-23:59:59). A threshold is a fuzzy number, represented as a right facing S curve which has a domain minimum value (e.g. 50), inflection point (e.g. 60), and domain maximum value (e.g. 70), and which represents an actor's normal amount of activity as learned from historical data. As values approach and exceed the domain maximum value they are considered anomalous. A threshold also has an outlier boundary. This boundary represents a value that is considered an outlier. The System calculates this outlier boundary using the standard deviation from events used to generate the threshold multiplied by a configurable value plus the domain maximum value. The outlier boundary is typically greater than the threshold's domain maximum value and is a fuzzy number, represented as a right facing S curve just like the threshold. To summarize, each threshold contains the following values: domain minimum (bottom of S curve), domain maximum (top of S curve), and outlier boundary, which is an S curve that sits to the right of the threshold, starting at the threshold's domain maximum and extending to the calculated outlier boundary value.

b. Action Value Analysis: Action values are tested against the terrain's threshold using the time of the event to determine which threshold to compare (day of week and hour of day). In this case the System compares the actual value of the action (e.g. 100) against the threshold to derive a Compatibility Index (CIX). If the CIX is 1.0 (value is greater than or equal to domain max) then the value is said to be in violation of the threshold. In this case the System emits a signal. If the value is less than domain max but greater than the near threshold alpha cut (0.65 for example), the value is considered to be a ‘near threshold violation’, in which case an anomaly of less importance than a threshold violation is generated. Last, in cases of a threshold violation, the System determines how much of a violation the value is by deriving a CIX against the outlier boundary threshold (S-Curve). If the action value is greater than or equal to the outlier boundary (CIX 1.0) then the action value is considered to be an outlier and the System generates an anomaly signal. If the value is greater than the threshold domain max value but less than the outlier boundary (outlier boundary CIX is less than 1.0) then the derived CIX is added to the threshold violation signal to indicate how much of a violation the value is against the threshold.

c. Terrain Cell Analysis: Terrains maintain a matrix of compressed values for each event received during a 15 minute (by default, but other times may be used) time period. For example, if the System is tracking email size and a user sends ten emails (each of size 1000) between 9:00 and 9:15, the email size terrain's cell for time period 9:00-9:15 will have a frequency count of ten, a value of 10000, and an average value of 1000. The Terrain Cell analysis compares the average value (1000 in this example) against the threshold generated for the day of week and time of day to determine if the activity over the current time period being tested is in violation of the threshold. The same tests described for event values are performed in this case but using the terrain's values rather than the event value itself.

d. Unusual Activity Time: The threshold analysis can also identify events that occur during an unusual time by looking at the threshold selected to compare against the event and cell value. If the threshold has a 0 value (i.e. no historical activity during this day of week and time of day) then the event is considered unusual.

e. Unusual Asset Access: An event can have optional associated source and destination assets. For example, an event can originate from a source (e.g. an IP address) and terminate at a destination (e.g. an IP address). The terrain tracks all source and destination assets that its events have accessed. If an action originates from or is terminated by an asset not in the list learned from historical information, the source asset or destination asset is considered unusual and the System emits an anomaly signal.
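The threshold selection, Action value analysis, and Terrain cell analysis of items a through c may be sketched as follows. This is a minimal Python illustration under stated assumptions, not the System's implementation: the function names, the particular quadratic S-curve formula, the use of a population standard deviation, the default multiplier of 2.0, the midpoint inflection of the outlier curve, and the outlier boundary of 90 are all hypothetical. The domain values 50, 60, and 70, the 0.65 alpha cut, and the email example are taken from the text above.

```python
from collections import defaultdict
from datetime import datetime
from statistics import pstdev


def threshold_slot(ts):
    """Select the threshold: (day of week, four-hour period of the day)."""
    return ts.weekday(), ts.hour // 4


def outlier_boundary(values, domain_max, multiplier=2.0):
    """Standard deviation * configurable multiplier + domain maximum.
    Population standard deviation and multiplier=2.0 are assumptions."""
    return pstdev(values) * multiplier + domain_max


def s_curve(x, low, inflection, high):
    """Right-facing S-curve membership (CIX) in [0, 1]; 0.5 at the inflection."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    if x <= inflection:
        return 0.5 * ((x - low) / (inflection - low)) ** 2
    return 1.0 - 0.5 * ((high - x) / (high - inflection)) ** 2


def classify(value, d_min, inflection, d_max, outlier, near_alpha=0.65):
    """Return (label, threshold CIX, outlier CIX or None)."""
    cix = s_curve(value, d_min, inflection, d_max)
    if cix >= 1.0:
        # Outlier curve runs from the domain maximum to the outlier boundary;
        # placing its inflection at the midpoint is an assumption.
        out_cix = s_curve(value, d_max, (d_max + outlier) / 2, outlier)
        label = "outlier" if out_cix >= 1.0 else "threshold violation"
        return label, cix, out_cix
    if cix > near_alpha:
        return "near threshold violation", cix, None
    return "normal", cix, None


def build_cells(events, cell_minutes=15):
    """Aggregate (timestamp, value) events into fixed-width terrain cells."""
    cells = defaultdict(lambda: {"count": 0, "total": 0})
    for ts, value in events:
        start = ts.replace(minute=ts.minute - ts.minute % cell_minutes,
                           second=0, microsecond=0)
        cells[start]["count"] += 1
        cells[start]["total"] += value
    return {k: {**c, "average": c["total"] / c["count"]}
            for k, c in cells.items()}


# Ten emails of size 1000 sent between 9:00 and 9:15 (the example above).
emails = [(datetime(2017, 3, 6, 9, m), 1000) for m in range(10)]
cell = build_cells(emails)[datetime(2017, 3, 6, 9, 0)]
print(cell)                               # count 10, total 10000, average 1000.0
print(classify(100, 50, 60, 70, 90)[0])   # outlier
print(classify(66, 50, 60, 70, 90)[0])    # near threshold violation
```

The cell average (1000.0 here) can be passed to `classify` in place of an event value, which mirrors the statement in item c that the same tests are applied to the terrain's values.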

Cognitive modeling component 210 (also known as the Extreme Vigilance component) is designed for use outside the Environment. The outside world can communicate with the cognitive modeling component in two ways. Consuming signals is bi-directional and may provide the most extensive communications interface method. The REST Application Program Interface is a bi-directional item that connects any program to the cognitive modeling component. The REST Application Program Interface has two related gateways. The primary gateway is to the Threat Detection process, which returns threats and the threat's support chain. A secondary gateway interacts with cognitive modeling at the command level, allowing the host application to issue commands and collect information.

The system also includes cognitive rule analysis functionality. In cognitive rule analysis, a rule package is a grouping of one or more rule sets and defines a logical grouping of rules that may be executed together. A rule set is associated with a taxonomy (Action, Process, and Field) and defines the events to test against the rules in a rule set. A rule set includes 1 . . . N individual rules tested against events retrieved from an Actor's Terrain(s) to be tested. The Taxonomy can be fully qualified: e.g. Transaction˜Beef˜TransactionAmt or can use a wild card to match multiple terrains: e.g. *TxnAmt. A rule is an individual rule to be tested as part of a rule set either against an event or collection of events (cell) for a period of time (hour, day, week, month). A rule consists of a condition to be tested (when clause), a then clause, and a rule weight. The value tested is defined in the when clause, which can contain operators:

-   -   value: The value from the event
    -   count(hour): The count of events from the event's terrain in the current hour of the day (e.g. 1:00:00-1:59:59)
    -   count(day): The count of events from the event's terrain in the current day (00:00:00-23:59:59)
    -   count(week): The count of events from the event's terrain for the current week (Sunday-Saturday)
    -   count(month): The count of events in the event's terrain for the current month of year (1st-last day of the current month).

Context is generated using historical information from the Terrain(s) and is used to compare the value to determine which term the value (event value or count) falls into. The rule compares the value tested against the context to determine compatibility of the value with the context's concept called out in the rule. For example, a rule that says “value is low” compares the event's value against the context's concept named ‘low’ to determine compatibility (CIX) with the concept's value.

The rule analysis runs against a model and rule package and time range. Processing may operate as follows:

for each Actor in the model, for each rule set:

-   -   retrieve events that match the rule set's terrain(s) (absolute or wild card)

for each matching event

-   -   for each rule in the rule set
        -   retrieve the valueToTest called out in the rule (event value or count over time)
        -   compare the valueToTest with the context's concept called out in the rule and derive a CIX
        -   add cix*ruleWeight to a variable called cixSum
        -   add the rule weight to a variable called weightSum
    -   derive the overall CIX by: cixSum/weightSum
    -   if the overall CIX is less than an acceptable value (e.g. 0.20) then the event is considered anomalous to the ruleSet; emit an anomaly signal
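The weighted compatibility computation in the steps above can be sketched in Python as follows. This is an illustration only: the two membership functions stand in for context concepts that the System would learn from Terrain history, and their shapes, the rule weights, and all names are assumptions. The 0.20 acceptance value comes from the text.

```python
def clamp(x):
    """Limit a membership value to the range [0, 1]."""
    return max(0.0, min(1.0, x))


# Hypothetical context concepts over a 0..100 value domain.
def is_low(value):
    return clamp((100 - value) / 100)


def is_moderate(value):
    return clamp(1 - abs(value - 50) / 50)


def evaluate_rule_set(value_to_test, rules, min_cix=0.20):
    """rules: list of (concept function, rule weight).

    Returns (overall CIX, True if the event is anomalous to the rule set).
    """
    cix_sum = 0.0
    weight_sum = 0.0
    for concept, rule_weight in rules:
        cix = concept(value_to_test)   # compatibility with the concept
        cix_sum += cix * rule_weight
        weight_sum += rule_weight
    overall_cix = cix_sum / weight_sum
    return overall_cix, overall_cix < min_cix


rules = [(is_low, 2.0), (is_moderate, 1.0)]
overall_high, anomalous_high = evaluate_rule_set(95, rules)
overall_mid, anomalous_mid = evaluate_rule_set(40, rules)
```

With these assumed concepts, a value of 95 is poorly compatible with both rules and falls below the acceptance value, so an anomaly signal would be emitted; a value of 40 does not.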

Cognitive Signals

Cognitive Signals are discussed here at a relatively high level. Cognitive signals provide bi-directional intra-application communication, inter-product communication, and application to client communications. Signals are automatically generated by all functional, analytical, and operational components of the Cognitive Model platform as well as every cognitive modeling application in the System. Clients may also generate, store, and use their own signals. The rule language will emit Cognitive Signals when specific rule conditions are satisfied.

Cognitive Signals (signals) are emitted by and consumed by all the analytical processes in the Cognitive Modeling (CM) platform as well as by all features built on the CM platform. While signals share many common attributes of audit entries and events, they constitute a privileged layer in the System's cognitive modeling operational and functional capabilities. Signals form a much deeper and more fundamentally integral fabric within the System.

The entire cognitive modeling platform as well as applications built on this platform generate a wide spectrum of “Cognitive Signals.” These signals are called “cognitive” because they reveal the underlying, time-ordered sequence of operations used to learn behavior patterns and then apply these patterns to detect trends, changes, anomalies, hazards, and threats. The signals are continuously generated inside the Cognitive Modeling platform by the execution of qualitative expression enabled machine learning, statistical analysis, knowledge-based policies (cognitive rule packages) and other computer science, heuristic, and artificial intelligence services.

Signals are consumed by different components of a CM-based feature, are consumed by different CM features (as one example, signals from the cognitive modeling for Behavior application can be used by cognitive modeling Anti-Fraud), are consumed by the component that actually emitted the signal (they are reflective), are consumed by client applications, and are consumed by commercial data analysis packages.

Signals are grouped according to their source. The most common sources include SYSTEM (application execution states), AUDITING (application control flow), DATAANALYSIS (incoming event and data model Actions), ORDERANALYSIS (learned Actor sequence of operations), PEERANALYSIS (Actor peer-to-peer similarity analysis), RULEANALYSIS (rule analysis Actions), HAZARDANALYSIS (generated Actor or asset hazards), and THREATANALYSIS (critical enterprise threats). As new functionalities are added to the CM layer, existing applications are refined or extended, and new applications are developed and deployed, additional source types may be developed, extended, or combined.

A set of categories captures the state of the analysis at various points in the analysis process. The prolog category captures the state of the global or local model just before algorithms are executed or other state changing functions are applied. The details category captures the information about data and statistics as the analysis progresses; the Action category captures the type, severity, and anomaly “trigger” states for any anomaly, hazard or threat; and the epilog category captures any important state changes that exist beyond the emission life of the signal.

In this context, the process of Hazard detection involves performing statistical analysis against a set of Actor Anomalies that occurred during the analysis time period, also known here as the “threat window.” Inputs to the Hazard detection process are a set of model-specific Actor Anomalies that occurred during the threat window time period (e.g. 24 hours). Every signal has an assigned severity; in order from low to high, the severities may include warning, caution, alert, and severe.

Hazard detection processing operates as follows:

-   -   Derive resolutionPeriodHours as threatWindowHours (e.g. 24) divided by the number of resolution periods (e.g. four); in this case each resolution period would be six hours.
    -   Create a matrix of signals organized by severity and resolution period (which is a sub-set of the threat window time period). The matrix has hours on the X axis and signal severity on the Y axis. The matrix maintains a count of signals during the resolution period at each severity. Below is an example of the matrix:

             00:00 < 06:00   06:00 < 12:00   12:00 < 18:00   18:00 < 24:00
# Warning
# Caution
# Alert
# Severe

-   -   Derive the signal promotion mappings, which define how many signals of each type (warning, caution, alert, severe) in a resolution period it takes to cause the severities of those signals to be promoted. This is done as follows:

a. Get severity value mappings:

m_severityValueMappings[scmSignal::Severity::UNDEFINED] = 0;
m_severityValueMappings[scmSignal::Severity::INFORMATION] = 0;
m_severityValueMappings[scmSignal::Severity::WARNING] = 2;
m_severityValueMappings[scmSignal::Severity::CAUTION] = 3;
m_severityValueMappings[scmSignal::Severity::ALERT] = 4;
m_severityValueMappings[scmSignal::Severity::SEVERE] = 5;
m_severityValueMappings[scmSignal::Severity::FATAL] = 0;

b. Get severity weight mappings:

double m_warningSeverityWeight = 1;
double m_cautionSeverityWeight = 5;
double m_alertSeverityWeight = 20;
double m_severeSeverityWeight = 50;

-   -   Get the hazard promotion multiplier, which is the multiplier to apply to the signal promotion algorithm (the default value may be, for example, 0.25) multiplied by the number of hours per resolution period (e.g. six)
    -   Get severity promotion mappings for each of the severities:

double signalCount = ((double)((maxSeverityValue - severityValue) + 1)) * promotionMultipler;

The following is log output from the Hazard detection process using default values showing the number of signals required at each severity to cause promotion:

Severity Promotion map—derived max signal severity value: 5

Calculated signal promotion count 6.0000 for severity: WARNING

Calculated signal promotion count 5.0000 for severity: CAUTION

Calculated signal promotion count 3.0000 for severity: ALERT

Calculated signal promotion count 2.0000 for severity: SEVERE

-   -   Perform signal promotion, which entails promoting the severity of signals during a resolution period window (time period) where the number of signals exceeds that severity's signal promotion count. After promotion the signal count matrix above may be adjusted (if necessary) with counts of signals per resolution period.
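The derivation of the signal promotion counts may be sketched in Python as follows, using the severity values and default figures given above (24 hour threat window, four resolution periods, multiplier 0.25). The function name is hypothetical. Rounding the raw count to the nearest integer with halves rounded up is an assumption inferred from the log output above, where the raw values 4.5 and 1.5 appear as 5.0000 and 2.0000; the text does not state the rounding rule.

```python
import math

# Severity values from the mapping above (severities mapped to 0 are omitted).
SEVERITY_VALUES = {"WARNING": 2, "CAUTION": 3, "ALERT": 4, "SEVERE": 5}


def promotion_counts(threat_window_hours=24, resolution_periods=4,
                     base_multiplier=0.25):
    """Number of signals per resolution period needed to promote a severity."""
    resolution_period_hours = threat_window_hours / resolution_periods  # 6.0
    promotion_multiplier = base_multiplier * resolution_period_hours    # 1.5
    max_severity_value = max(SEVERITY_VALUES.values())
    counts = {}
    for name, severity_value in SEVERITY_VALUES.items():
        raw = ((max_severity_value - severity_value) + 1) * promotion_multiplier
        # Assumed rounding: nearest integer, halves rounded up.
        counts[name] = math.floor(raw + 0.5)
    return counts


counts = promotion_counts()
```

Under these assumptions the result reproduces the logged counts: 6 for WARNING, 5 for CAUTION, 3 for ALERT, and 2 for SEVERE.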

With the signals promoted, calculate risk exposure for each terrain. An example of pseudo code for this process is as follows:

for each Actor Terrain with an Anomaly signal
  for each resolution period
    organize signal counts in a matrix by severity (warning, caution, alert, severe)
    for each severity, calculate risk:
      Derive bias: double bias = (double)(biasWeight*(double)count);
    Calculate resolution period severity risk: severityRisk = (severityBias / biasSum);
    Add the resolution period severity risk to the severityRiskSum that maintains total risk per severity.
  for each severity (warning, caution, alert, severe)
    averageSeverityRisk = severityRiskSum / numTerrainsWithNonZeroRisk
    bias the severity risk using severityWeight: biasedSeverityRiskAverage = (severityWeight * severityRiskAverage)
    track the sum of all severity average risks: severityBiasedAverageSum += biasedSeverityRiskAverage
  for each severity (warning, caution, alert, severe) derive severityBiasedRisk average:
    Derive severityBiasedRisk: severityBiasedRisk = (severityBiasedRiskAverage / severityBiasedAverageSum)
    store the severityBiasedRisk in matrix with severity on Y axis and biased risk on the X axis
  Derive overall Actor risk by summing strong severity biased risk (alert and severe): actorRisk = severityBiasedRisk[alert] + severityBiasedRisk[severe]
  Emit Actor Hazard signal reporting the overall actorRisk score and all details gathered for the resolution period
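One possible reading of the pseudo code above is sketched below in Python. Several points are assumptions rather than statements of the source: the biasWeight is taken to be the severity weight listed earlier (1, 5, 20, 50), each (terrain, resolution period) pair contributes one count matrix, and numTerrainsWithNonZeroRisk is taken as the number of such matrices with a non-zero bias sum. The function and variable names are illustrative.

```python
SEVERITIES = ("warning", "caution", "alert", "severe")
SEVERITY_WEIGHTS = {"warning": 1, "caution": 5, "alert": 20, "severe": 50}


def actor_risk(count_matrices, weights=SEVERITY_WEIGHTS):
    """count_matrices: one {severity: signal count} dict per
    (terrain, resolution period). Returns (actor risk, biased risk by severity).
    """
    risk_sum = dict.fromkeys(SEVERITIES, 0.0)
    nonzero = 0
    for counts in count_matrices:
        bias = {s: weights[s] * counts.get(s, 0) for s in SEVERITIES}
        bias_sum = sum(bias.values())
        if bias_sum == 0:
            continue  # no signals, so no risk contribution
        nonzero += 1
        for s in SEVERITIES:
            risk_sum[s] += bias[s] / bias_sum  # resolution period severity risk
    if nonzero == 0:
        return 0.0, dict.fromkeys(SEVERITIES, 0.0)
    # Average per severity, then bias the average by the severity weight.
    biased_avg = {s: weights[s] * (risk_sum[s] / nonzero) for s in SEVERITIES}
    biased_avg_sum = sum(biased_avg.values())
    biased_risk = {s: biased_avg[s] / biased_avg_sum for s in SEVERITIES}
    # Overall actor risk sums the strong severities only.
    return biased_risk["alert"] + biased_risk["severe"], biased_risk


# Hypothetical counts: four warnings and a caution in one cell, one alert in another.
risk, by_severity = actor_risk([{"warning": 4, "caution": 1}, {"alert": 1}])
```

Because the biased risks are normalized to sum to one, the resulting actor risk lies in the range [0, 1]; in this sample a single alert dominates the weaker signals and yields a risk of roughly 0.86.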

When the process is complete, a series of hazard signals may be emitted by the System, one for each Actor with (at least one) Anomaly signal during the resolution period, reporting the derived Actor risk score. These signals may be used as input for the threat detection process.

Threat Detection processing may operate as follows. First, the System identifies the unique list of resolution periods identified in all Hazard signals received as input. A resolution period in the Hazard detection process takes the threat window (24 hours for example) and divides that time window into small chunks of time (for example, six hours). For Threat Detection, the resolution periods found in Actor Hazard signals are identified for use. The following is an example of processing that may occur for Threat Detection:

for each resolution period found in hazard signals
  timePeriodAssetList = retrieve list of assets associated with each hazard signal found during the resolution period
  timePeriodActorHazards = retrieve list of all actors and hazard signals that occurred during the resolution period
  for each asset in timePeriodAssetList
    assetDerivedRisk = Get the maximum risk value from all actor hazards that reference the asset
    scaledAssetRisk = assetDerivedRisk * (assetRisk/100)
    if scaledAssetRisk > minimumAssetRiskThreshold
      generate Asset Threat Signal
  for each actor hazard signal
    scaledActorRisk = hazardActorRiskScore * (actorRisk/100)
    if scaledActorRisk > minimumAssetRiskThreshold
      generate ActorThreat signal
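The Threat Detection processing above may be sketched in Python as follows. The dictionary layout of a hazard signal, the asset and actor names, the configured risk percentages, and the threshold of 0.5 are all hypothetical values chosen for illustration; only the scaling and comparison steps are taken from the pseudo code.

```python
def detect_threats(hazards, asset_risk, actor_risk, min_risk_threshold):
    """hazards: dicts with 'period', 'actor', 'risk_score', and 'assets'.

    asset_risk / actor_risk: configured risk percentages (0..100).
    Returns a list of (kind, period, name, scaled risk) tuples.
    """
    threats = []
    for period in sorted({h["period"] for h in hazards}):
        in_period = [h for h in hazards if h["period"] == period]
        assets = {a for h in in_period for a in h["assets"]}
        for asset in sorted(assets):
            # Maximum risk among all actor hazards that reference the asset.
            derived = max(h["risk_score"] for h in in_period
                          if asset in h["assets"])
            scaled = derived * (asset_risk[asset] / 100)
            if scaled > min_risk_threshold:
                threats.append(("asset", period, asset, scaled))
        for h in in_period:
            scaled = h["risk_score"] * (actor_risk[h["actor"]] / 100)
            if scaled > min_risk_threshold:
                threats.append(("actor", period, h["actor"], scaled))
    return threats


hazards = [
    {"period": 0, "actor": "alice", "risk_score": 0.86, "assets": ["db01"]},
    {"period": 0, "actor": "bob", "risk_score": 0.30,
     "assets": ["db01", "web02"]},
]
threats = detect_threats(
    hazards,
    asset_risk={"db01": 90, "web02": 20},
    actor_risk={"alice": 80, "bob": 50},
    min_risk_threshold=0.5,
)
```

In this sample only the asset db01 and the actor alice exceed the threshold, so one asset threat and one actor threat would be generated for the resolution period.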

The cognitive modeling line of features and services is built on the Cognitive Modeler (CM) platform, which consists of a tightly integrated, yet loosely coupled, collection of machine intelligence and advanced computer science sub-systems. The underlying technology used in these cognitive modeling features is, in most cases, hidden from the user. This is the case in any cognitive modeling application. From the user's perspective, the feature, as shown in FIG. 3, has three elements.

FIG. 3 shows a first high level view of cognitive modeling. In FIG. 3, there are events going into the model at point 301, signals coming out of the model at point 302, and a command language that processes the outgoing signals at point 303. The FIG. 3 illustration is somewhat simplified: there are also commands that select the incoming events as well as commands that run various phases of the model. But from a client perspective the input-output-analyze relationship is the primary functional sequence. Understanding the role of events and signals in the application is important in understanding the kinds and breadth of information generated by a population of signals at any point in time.

FIG. 4 expands the interplay of signals among feature commands as well as into the client's own applications and models. FIG. 4 thus represents a further cognitive modeling high-level overview. From FIG. 4, element 401 represents a stream of events flowing into the cognitive modeling function or application. This is the only point where events enter the System (through, for example, an AcquireEvents( ) command). Subsequent processing involves signals (which may be, but are not necessarily, transformational representations of one or more events). At element 402, the cognitive modeling behavior and Action modeling components digest the events. These events are used to initially train the machine learning models and then, when the cognitive modeling model is placed into operation, the events are used to learn a wide variety of unusual behaviors. The detection of an unusual behavior generates one or more signals. Element 403 indicates that the signals generated by the behavior models inside cognitive modeling describe the nature of anomalous behaviors relative to normal behaviors. These signals are consumed by a variety of application analytical commands. These commands often combine the information from many signals ordered over time to generate their own set of signals. From element 404, a few of these signals occasionally form an adaptive or cyclical feedback loop with the core model components.

At point 405, the feedback process generates a set of Actions within the model, which, in turn, produces another set of output signals (element 402). For element 406, for signals that do not form a feedback loop into the behavior models, the cognitive modeling analytical commands which have consumed the behavior model signals also perform a wide spectrum of analyses and generate their own set of output signals. From element 407, these next generation signals are consumed by other commands further down the “food chain.” From element 408, the entire time-varying signal output from any of the cognitive modeling applications and services is available to the client. These files can be consumed and used by any client processes.

Cognitive Signals are not only used as the backbone for advanced machine reasoning applications, but they provide tools to develop extensions, integrate cognitive modeling with strategic corporate or agency applications, and perform additional machine learning analyses. In general, Cognitive Signals can be used to build vital links between semantic-based machine intelligence services and a common client to obtain more knowledge out of even the most powerful machine learning environment. Hence, Cognitive Signals constitute an audit of all the operations underlying the computing and machine intelligence operations of the applications. Cognitive Signals maintain and expose, in terms of results, the qualitative semantic (fuzzy logic and linguistic variable) nature of our machine intelligence functions (the actual algorithmic processing is not exposed). Signals support the necessary time sequence information for deep root cause analysis (this is how cognitive modeling does root cause discovery and the signals can be used for custom root cause analysis). Signals provide critical information for model performance analysis. Signals support (and for hazards and threats, contribute to) the analysis of information density (trustworthiness). Signals provide the flow of state information between cognitive modeling features and applications. Signals connect components within a cognitive modeling feature or application. Signals support client connectivity between features or applications using semantic rules (rules can emit their own client-configured signals). Signals provide connectivity to other systems, such as for advanced analytics and customer analyses. Signals provide the ability to integrate cognitive modeling applications into external enterprise applications. Signals can be explored using almost any data management or model tool. Signals can be investigated using other machine learning and commercial statistical analysis applications (such as SAS). Signals are used in the model configuration process and in semantic (linguistic variable) optimization.

The stream of signals provides information and insight into what is happening inside the model as well as the results of the analyses performed inside the model. That is, the only things the outside world can know about the functioning of a model are made visible through the signals. This includes GUI interfaces.

Computing Trustworthiness

One question is how to trust the System's behavior modeling analyses and predictions. Fundamental to Cognitive Modelling capabilities is the measurement of trustworthiness. The following discusses the issues with evolving a measure of trust in a model.

The wide-spread and very deep use of fuzzy logic as the underlying epistemological algebra in our cognitive models means that trustworthiness of the models can be assessed using the degree to which they conform to a set of key performance metrics. The most important metrics are associated with four principles.

The first three, density, volatility, and similarity, are scalar values. All three of these values are ultimately measured as elastic tensors (fuzzy set membership values), but the use of similarity (a unique property derived from fuzzy mathematics) removes one dimension of the metric from probabilities, Euclidean distances, and ratios. The fourth principle is periodicity and defines the recurring time-varying patterns of the data over different time horizons. Periodicity accounts for changes in density and volatility of data over different time frames.

The time-varying conventional and fuzzy mathematics used to ultimately derive a trust measurement is complicated, requires significant knowledge of fuzzy set theory and information theory, and is beyond the scope of this disclosure. The purpose of this disclosure is to describe the ideas and concepts brought together in order to consistently and reliably compute a trustworthiness value. Like all critical metrics in these models and devices, trust is a value in the range [0,1]. The underlying response curve is sigmoidal (S-shaped) with an inflection point around [0.80]. Since the majority of trust resides in the density and volatility values and similarity and periodicity are rather complex mathematical properties, this overview deals only with Terrain and graph density and volatility.

The trustworthiness metric is based on the idea of information entropy, that is, the amount of “noise” in the System. In information theory “noise” has several meanings, but we pull together two concepts: density and volatility.

Density is the average information saturation in a Terrain or graph and reflects the idea of potential versus actual data. Density in this context is the fundamental measure of noise. The less data available in Terrains and graphs (and other machine learning structures including clusters and classifiers), the more empty spaces available in the structure that encodes information (and, ultimately, knowledge).

Volatility, as the name implies, is a time-varying property of the data that reflects the rapidity of change in the fundamental statistical properties of the data. In terms of trustworthiness, the System measures both the rate of change (the first derivative over time) and the rate of the rate of change (the second derivative over time). The higher the volatility of the data, the higher the potential noise (as measured by a standard error of estimate from one time period to the next).

Hence, noise is the ratio of potential knowledge to actual knowledge scaled by the data's volatility (basically density×volatility). Noise is represented as the metric Su(t) in the System, since it is related to the idea of uncertainty over time in Shannon information theory. As Shannon said in his 1948 paper, “The conditional entropy Hy(x) will, for convenience, be called the equivocation. It measures the average ambiguity of the received signal.” This idea of information ambiguity is central to the trusted model measurement.

The Cognitive Modeling platform is built on the concept that the degree of trustworthiness can be measured and displayed in the System's algorithms and (as a consequence) system pattern discoveries and predictions. As noted, trustworthiness is based on a version of information entropy (the ratio of noise to data in the System). In essence, trustworthiness is based on the amount of data available in the relevant ecologies versus how much potential data could be available. How the System measures the amount of “potential” data available is both important and difficult.

In cognitive modeling, the System seeks to measure trustworthiness across operational dimensions: model, ecology, and classifier. Each dimension is enclosed by a higher level dimension, and thus a model contains a set of ecologies and an ecology contains a set of Actor landscapes, which, in turn, contain the basic machine learning structures (Terrains and Graphs).

The level of noise in Terrains and Graphs introduces ambiguity and inhibits learning, with precision, not only the actual behavior of Actors but also the active work periods for Actors (both human and machine). The less non-volatile data available in Terrains and Graphs (and other machine learning structures including clusters and classifiers), the more actual or virtual empty spaces are available in the structures that encode information (and, ultimately, knowledge). A virtual empty space is a space with data but whose standard error of estimate (SEE) is higher than the statistical limit version of r.

Terrain trust: One obvious, but simplistic, measure of Terrain density (dT), as an example, is based on a relatively straight-forward weighted average of the amount of data found in a Terrain:

$d^{T} = \frac{1}{r \times n}\sum\limits_{i = 1}^{n}\sum\limits_{j = 1}^{r} T\left( j,i \right)$

Where,

-   -   r is the number of time of day increments in the terrain
    -   n is the number of days over which the density should be computed
    -   T( ) is a terrain manifold, with T(j,i) the value at time of day increment j on day i
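As a small worked illustration of the density measure defined above, the following Python sketch treats a Terrain as n days of r time of day increments and averages all cell values. The nested-list representation, the function name, and the sample values are assumptions made for illustration.

```python
def terrain_density(terrain):
    """Average amount of data per cell.

    terrain: list of n days, each a list of r time-of-day cell values.
    """
    n = len(terrain)
    r = len(terrain[0])
    return sum(sum(day) for day in terrain) / (r * n)


# Hypothetical terrain: n = 2 days, r = 3 time-of-day increments.
sample_terrain = [[0, 2, 4],
                  [6, 0, 0]]
density = terrain_density(sample_terrain)  # 12 / (3 * 2) = 2.0
```

Note that half of the cells in this sample are empty even though the average is 2.0, which is the limitation of the simple average that the following paragraph addresses.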

However, the weighted average amount of data in a Terrain is insufficient in understanding the ratio of actual data to potential data. For this we are concerned with the number of (time of day, day of year) cells on the Terrain that are occupied (regardless of the magnitude of the data) versus the total number of available cells. For this we define a measure Su(t), the Shannon local information entropy (u) and time (t) for a particular landscape. Su(t) is a better, but still straightforward, metric:

${{S_{i}^{u}(L)} = {\frac{\sum\limits_{n = 1}^{K}\; {\exists\left( {{T(n)} > r} \right)}}{\sum\limits_{n = 1}^{K}\; {\exists\left( {T(n)} \right.}} \times {T(n)}}},$

Where,

-   -   K is the total number of terrains in the landscape
    -   ∃( ) is, for each element in (x), the count of( ) function
    -   T(n) is the n^(th) terrain in the system
    -   r is the minimum data cell density to qualify as “not empty”
    -   T( )_(v) is the volatility associated with the n^(th) terrain

The System uses the local information entropy to easily compute the global information entropy (Du(t)). This is simply:

$D_{t}^{u} = {\frac{1}{N}{\sum\limits_{i = 1}^{N}\; {S_{i}^{u}(i)}}}$

Where,

-   -   N is the number of local (landscape) entropy values
    -   S^(u) _((i)) is the i^(th) landscape entropy
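The occupancy idea behind the local measure and the averaging step of the global measure can be sketched in Python as follows. This is a simplified illustration: it counts occupied cells within a single Terrain, as the prose above describes, and omits the volatility scaling term; the function names, the nested-list Terrain layout, and the sample values are assumptions.

```python
def occupied_fraction(terrain, min_density):
    """Share of cells whose value exceeds the 'not empty' minimum.

    terrain: list of days, each a list of time-of-day cell values.
    """
    cells = [value for day in terrain for value in day]
    occupied = sum(1 for value in cells if value > min_density)
    return occupied / len(cells)


def global_entropy(local_values):
    """Global value as the plain average of the local (landscape) values."""
    return sum(local_values) / len(local_values)


sample_terrain = [[0, 2, 4],
                  [6, 0, 0]]
local_value = occupied_fraction(sample_terrain, min_density=0)  # 3 of 6 cells
global_value = global_entropy([local_value, 0.75, 1.0])
```

Here the sample Terrain is half occupied (0.5), and averaging it with two other hypothetical landscape values gives a global value of 0.75.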

Hence, the simplest global System entropy (trust) is the average of the landscape trusts. This can be adjusted based on other factors such as the criticality of the Actor associated with Su(t), the average criticality of the Terrains, the number of Terrains, and the behavior of the Actor relative to the Actor's peer group (which is also affected by the entropy of the collected Terrains across peers). The same calculation is used to compute the global trustworthiness for graphs from the local graph entropy.

With respect to graph trust, like Terrains, a graph settles into a state of operational information density (Su(t)). Like Terrains, if the Actor changes behavior, then the information density of the System (this is still a form of Shannon Information Entropy) changes until it can settle once more in a steady state.

In a collection of Terrains, the System measures local Information Density (Su(t)(L)) by how much of the discovered work area is populated with events (the average Terrain density). If a Terrain is lightly populated, the Terrain lacks enough data to give the machine learning and prediction engines sufficient precision and clarity to make reliable (or effective) decisions.

A graph suffers from the same bias when faced with a lack of information. Information in a graph is reflected in the magnitude of the mean probability binding the nodes. Hence, in a graph the corresponding Information Density is the average inter-node (that is, edge) probability of the entire graph above a specific threshold. The following equation (somewhat simplified to exclude probabilistic volatility) shows this:

$S_{i}^{u} = \frac{1}{N}\sum\limits_{i = 1}^{N - i}\sum\limits_{j = 1}^{K} p\left( e_{i,j} \right) - \left( \frac{1}{N - r}\sum\limits_{i = 1}^{(N - r) - 1}\left( g(i)\sum\limits_{j = 1}^{K} p\left( e_{i,j} \right) \geq m \right) \right)$

Where,

-   -   N is the number of nodes in the graph
    -   K is the number of outbound edges for node(i)
    -   p(e_(i,j)) is the probability associated with edge(j) for node(i)
    -   r is the excluded node count
    -   g(i) is a selector function that chooses the i^(th) node that has not been excluded
    -   m is an edge's minimum probability threshold for being selected

Thus the higher the residual mean probability of the entire graph, the more data underlies the transition states. The parameter m is the minimumIDProbabilityThreshold property and excludes probabilities that are less than this value. A default value may be provided, such as [0.10]. The parameter r is the number of edges excluded because they are below parameter m.

The rationale for this is as follows: this approach first calculates the mean probability on every edge in the graph (call this T). Then the mean probability of every edge that is above a given threshold is calculated, and this value is called R. Then, if m is selected correctly, Δ(T,R) is the actual information density in the graph.

With respect to graph stability and steady states, when new behaviors are introduced into the graph (hence it becomes a time-varying unbounded system) any steady state will begin to fluctuate. But this steady state fluctuates around the changes in either the activity on existing nodes or the introduction of new nodes (Taxonomies) or both. Computing Information Density as graph stability decays is therefore beneficial. This is where comparing learned behaviors and new behaviors provides insight into the emergence of new time-varying behaviors (which start as anomalies and then become normal behaviors). As new behaviors reach a steady state, they form the behavior platform from which anomalies in the new data are detected.

With respect to trust semantics, the metric Su(t) or Du(t) will be a value in the range [0,1] and can be treated as a fuzzy compatibility index (CIX). Thus it is associated with a Context (TRUSTWORTHINESS) to provide the application user with a semantic measure of trust (such as VERY HIGH, HIGH, MEDIUM, LOW, VERY LOW). Through some user interaction, or based on the context or some other set of Actors, the actual value of density and volatility (as well as their semantic measure) can be displayed.

With respect to trustworthiness, a better way to compute actual information density is based on the probabilistic subset of the Terrain that contains periodic work patterns. That is, information density reflects an Actor's actual working days of the week and hours of the day. Calculating the work-specific or active information density requires determining which days of the week are actual work days for the Actor.

This pattern-based information density generates a work pattern array W[dow][p] where the day of week (dow) dimension has a cardinality of seven (for example, Monday through Sunday) and the period (p) dimension has a cardinality of P. The first step is completing W_(t,j) for each actor using the terrain history data. The process appears as:

$W_{P{(j)}}^{t} = \left. {\sum\limits_{i = 0}^{N}\; t}\leftarrow{{{dow}(i)}{\sum\limits_{j = 0}^{P}\; {\forall\left( {{c{\lbrack\rbrack}}_{i,j} > 0} \right)}}} \right.$

where:

-   -   W^(t) _((j)) is the weekday density matrix for the actor
    -   N is the number of days in the terrain
    -   t is the day of week corresponding to the day of the year (N)
    -   dow( ) is a function that returns the day of week number of a given N
    -   P is the number of periods in a day
    -   c[ ]i,j is the cell for the ith day and jth period
    -   ∀ is the count-when( ) function

When complete, this will produce a 7×P array for the Actor indicating the count of times a value was detected on a particular day of the week. The count of the number of times may be stored in the period dimension. In order to reduce the occurrence count to an incident pattern, the System normalizes the values:

$W_{P{(i)}}^{t} = \frac{W_{p{(i)}}}{\max \left( W_{p()} \right)}$

That is, the count of the number of cells in the working pattern Terrain subset divided by the maximum cell value in the working pattern Terrain subset. The challenge is to define this rectangular subset of the complete terrain that reflects the probable working times and days of the Actor. Table 1 illustrates a Terrain for an Actor whose work hours and days are non-standard.

TABLE 1 An Actor Terrain
Time of Day
2200 1
2100 4 5 1
2000 8 1 3 5 2
1900 14 10 10 13 16 12 15 34 11
1800 16 11 9 14 22 17 16 22 24 29
1700 12 15 8 14 19 26 10 14 17 19
1600 1 2 1 8
1500 8 23 21 14 12 20 31 44 9
1400 33 11 22 17 13 10 12 16 21
1300 14 26 18 12 12 12 22 12 24 29
1200 16 11 19 24 7 10 17 14 19
1100 3 1
1000 2 1 3 5
34 35 36 37 38 39 40 41 42 43 44 45 46
M T W T F S S M T W T F S
Day of the Week

This Actor works eight hours from noon to 7 pm on Wednesdays through Sundays. He/she takes a lunch or early dinner break at 4 pm. To discover this pattern, we need to understand the surface profile of the Terrain; that is, a Terrain with data is a three-dimensional surface (the data forming the height).

Step 1. Find probable Work days for an Actor. The System looks for days of the week that have concentrated activity patterns. If the Actor's work week is “crisp” enough then every day that is not a work day will have an activity count of zero. This will probably be mostly the case, but we can sharpen our focus by removing noise or small periodic work events from consideration (this is called “unvoting” the day). This provides PWD (Probabilistic Work Days).

The System uses a reinforcement approach in isolating the work days. The System runs through a deep set of history days, counting the number of nonzero cells in each column (hours of the day). The System accumulates these counts in a probable day of week histogram (W[ ]).

Begin: Find Working Days
 let T[ ][ ] = the terrain
 let W[0,1,...6] = possible work day
 let unvoteThreshold = .3
  for each dayOfYear, d = 1, 2, 3, ..., m
   for each periodOfDay, p = 1, 2, 3, ..., hoursInDay*intervals
    dow = DOW(dayOfYear)
    if T[d,p] > 0
     W[dow] = W[dow] + 1
    end if
   end for each periodOfDay
  end for each dayOfYear

The count in the day of week window (W) acts like a vote for this day of week as a working day. To find out if this vote is valid, the System determines how strong it is relative to all the other votes. Thus the System measures the ratio of the sum of all the votes for a day of week against the day of week with the maximum number of votes:

 maxVote = -1
 for n = 1 to 7
  if W[n] > maxVote
   maxVote = W[n]
  end if
 next n

The System then computes the ratio and checks to see if it is above the unvoteThreshold. If so, the System marks the day of week as a work day.

 for n = 1 to 7
  r = W[n]/maxVote
  if r > unvoteThreshold
   WorkDay[n] = true
  end if
 next n
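By way of example only, the work day discovery steps above may be sketched in Python as follows. The function name `find_work_days`, the list-of-lists Terrain layout, and the zero-based weekday numbering (0 for Monday through 6 for Sunday) are assumptions made for illustration and are not part of the original disclosure.

```python
def find_work_days(terrain, day_of_week, unvote_threshold=0.3):
    """Vote for probable work days of the week.

    terrain: one list of per-period event counts for each history day.
    day_of_week: weekday index (0-6) for each history day (assumed layout).
    Returns a list of seven booleans, True where the day is a work day.
    """
    votes = [0] * 7
    for day_cells, dow in zip(terrain, day_of_week):
        # Each nonzero cell is one vote for that day of the week.
        votes[dow] += sum(1 for cell in day_cells if cell > 0)

    max_vote = max(votes)
    if max_vote == 0:
        return [False] * 7  # no activity at all, so no work days
    # A day survives only if its vote ratio exceeds the unvote threshold.
    return [vote / max_vote > unvote_threshold for vote in votes]


# Hypothetical two-week history with four periods per day: steady activity
# Wednesday through Sunday, one stray event on Mondays, nothing on Tuesdays.
days = [d % 7 for d in range(14)]
history = [
    [3, 5, 2, 4] if dow >= 2 else ([0, 1, 0, 0] if dow == 0 else [0, 0, 0, 0])
    for dow in days
]
print(find_work_days(history, days))
# prints [False, False, True, True, True, True, True]
```

In this sketch the Monday noise receives a vote ratio of 0.25, which falls below the 0.3 threshold, so Monday is "unvoted".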

Step 2. Find probable Hours of the Day an Actor Works. This is the horizontal analysis to the work day's vertical analysis. The System looks for the hours of the day that have concentrated activity patterns. If the Actor's work hours are “crisp” enough then every hour that is not a work hour will have an activity count of zero. Unlike work day discovery, work hours will generally have a high degree of noise (since people come in early and stay late on a work day, but hardly ever come in on a non-workday). Similar to work day discovery, the System can smooth out or soften focus by removing noise or persistent periodic work events from consideration (this is called “unvoting” the time of day). This yields PWH (Probabilistic Work Hours).

Unlike work days, the boundaries of work hours are not so precisely defined. Work hours are generally quite elastic. It is quite common for people to come in early or stay late to catch up on work, socialize, assist in other projects, and a whole range of other reasons. An effective behavior modeling system must take this imprecision in work hours into account. That is, if a system models a person's hours exactly (finding that eight hour window) then that system may miss the fog or foam around the start and end hours that are, for all reasonable purposes, an integral part of the person's work day. To model the near outliers on a work day, the present System lowers the unvoteThreshold to a value between 0 and 1, such as 0.15 or 0.20.
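As a non-limiting sketch, the work hour discovery may reuse the same voting approach across periods of the day with the lowered threshold. The function name `find_work_periods` and the data layout are hypothetical, and counting votes over all history days (rather than only over discovered work days) is an assumption of this sketch.

```python
def find_work_periods(terrain, unvote_threshold=0.15):
    """Vote for probable work periods of the day.

    terrain: one list of per-period event counts for each history day.
    Returns one boolean per period, True where the period is a work period.
    """
    periods = len(terrain[0]) if terrain else 0
    votes = [0] * periods
    for day_cells in terrain:
        for p, cell in enumerate(day_cells):
            if cell > 0:
                votes[p] += 1  # one vote per day with activity in period p

    max_vote = max(votes, default=0)
    if max_vote == 0:
        return [False] * periods
    return [vote / max_vote > unvote_threshold for vote in votes]


# Hypothetical ten-day history with six periods: periods 1-4 are always
# active, period 0 is active on two days (early arrival), and period 5 is
# active on one day only.
history = []
for day in range(10):
    row = [0, 4, 6, 5, 3, 0]
    if day < 2:
        row[0] = 1
    if day == 9:
        row[5] = 1
    history.append(row)

print(find_work_periods(history, 0.15))  # early-arrival period is retained
print(find_work_periods(history, 0.30))  # early-arrival period is unvoted
```

With the lower threshold the early-arrival period (vote ratio 0.2) remains part of the work hours, while the single late event (vote ratio 0.1) is still treated as noise.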

When the System has identified the Actor's day of the week work days and the Actor's hour of the day work times, the System has reduced a Terrain to the core information region (IR). It is this region (see Table 2) that is used to calculate information density. The System also uses this region to compute native or organic thresholds.

TABLE 2 The core Information Region
Time of Day
1900 14 10 10 13 16 12 15 34 11
1800 11 9 14 22 17 16 22 24 29
1700 12 15 8 14 19 10 14 17 19
1600 1 2 1 8
1500 8 23 21 14 12 20 31 44 9
1400 33 11 22 17 13 10 12 16 21
1300 26 18 12 12 12 22 12 24 29
1200 16 11 19 24 7 10 17 14 19
36 37 38 39 40 43 44 45 46
W T F S S W T F S

The core information region adequately represents any kind of Actor. As an example, a server (machine) can run 24 hours a day seven days a week, so its IR could possibly be the entire Terrain. A point of sale machine might run only when the associated retail store is open (16 hours a day). But the same subset IR identification algorithm will likely work for every case.

The idea of information density and Shannon Entropy is critical to machines making decisions based on different degrees of evidence. If a client does not have enough information in their landscapes and ecologies, then the behavior modeling mechanisms cannot make accurate classifications and predictions. Information density is assessed in signals derived from Thresholds and Graphs (which have their own form of information). This ID measurement is employed to calculate trustworthiness and reliability metrics all the way up the anomaly-to-threat chain of custody path.

Trust Processing

begin DiscoverActiveRegions
 let Hd = history depth for active terrain region
 set Hd = property.historydepth
 for each actor class, C_(k) = c1, c2, c3, ..., cR
  for each actor, A_(j) = a1, a2, a3, ..., aM
   for each terrain, T_(i) = t1, t2, t3, ..., tN
    discover active (work) region for T_(i) using Hd
     see: Step 1. Find probable Work days for an Actor
     see: Step 2. Find probable Hours of the Day an Actor Works
    store results in W[ ][ ] for terrain T_(i)
   end for each terrain
  end for each actor
 end for each actor class
end DiscoverActiveRegions

Having the work region for each Terrain associated with an Actor in a specific actor class, the System can compute the amount of coverage in the work area as the ratio of actual coverage to potential coverage. This is the landscape trust.

begin Compute Over-all Actor Trust
 for each terrain, Ti = t1, t2, t3, ..., tN
  compute terrain densityRatio (dr) based on:
   active region data, active region capacity
   $d_{r} = \frac{\forall{{{c_{wr}\lbrack\mspace{11mu}\rbrack}\lbrack\mspace{11mu}\rbrack} > 0}}{\forall c_{wr}}$
  store with terrain Ti
 end for each Terrain
 compute average dr
 return wgtdavg(dr) as actor trust
end Compute Over-all Actor Trust

Instead of a normal statistical mean, the trust metric can be responsive to the importance of the Terrain using, for example, a weighted average. A weighted average captures the idea of biasing the trust based on the coverage of critical terrains (or even critical Actors). Such a trust metric related weighted average is computed by:

${\overset{\_}{d}}_{r} = \frac{\sum\limits_{i = 1}^{N}\; {{dr}_{i} \times T_{i}^{w}}}{\sum\limits_{i = 1}^{N}\; T_{i}^{w}}$

where T^(w) _((i)) is the criticality weight (or other categorical or semantic weight) associated with the current Terrain. In this way trustworthiness incorporates the idea that some Terrains need a higher degree of data coverage than others.
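As an illustrative sketch only, the density ratio and the criticality-weighted actor trust described above may be computed as follows. The function names `density_ratio` and `actor_trust`, the list-of-lists region layout, and the sample weights are hypothetical.

```python
def density_ratio(region):
    """Fraction of cells in an active (work) region that hold data."""
    cells = [cell for row in region for cell in row]
    if not cells:
        return 0.0
    return sum(1 for cell in cells if cell > 0) / len(cells)


def actor_trust(regions, weights):
    """Weighted average of per-Terrain density ratios.

    regions: one active region per Terrain.
    weights: criticality weight for each Terrain, in the same order.
    """
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one Terrain weight must be nonzero")
    ratios = [density_ratio(region) for region in regions]
    return sum(d * w for d, w in zip(ratios, weights)) / total_weight


# Two hypothetical Terrains: the first is 75% populated and three times
# as critical as the second, which is only 25% populated.
critical_terrain = [[1, 2], [0, 3]]
minor_terrain = [[0, 0], [5, 0]]
print(actor_trust([critical_terrain, minor_terrain], [3, 1]))  # prints 0.625
```

An unweighted mean of the two ratios would be 0.5; the weighting pulls the result toward the coverage of the more critical Terrain.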

The trustworthiness of a model may be the weighted average of the landscape d_(r) averages (that is, Actor trustworthiness) for each of the Actors in each of the Actor categories. In this way the System can also assign a weight to each of the Actors, identifying Actors having a higher degree of criticality or cost or intrinsic instability (such as newly hired Actors, regardless of their Terrain coverage).

Compute Servers

The device operating the concepts disclosed herein is a compute server arrangement. Compute servers are hardware employed to perform specific functionality in this realm, primarily to address the problem of how to analyze a “perceived infinite” amount of data to better understand the behavior of the critical resources of the environment. Thus the servers used to compute the behaviors and concepts disclosed include hardware and software for scaling algorithms across an actor population. The system employs a series of knowledge nodes, where the compute server arrangement performs multi-pass analysis as part of the overall process. The system employs mesh computing, and processing on the system is driven by a supervisory node. Overall, the analysis performed by the supervisory node and the knowledge nodes is driven by a taxonomic normalization of the data, where the data is exceedingly large or seemingly infinite.

The compute servers perform four general functions, including normalize, migrate, supervise, and analyze, in order to scale the performance of multiple disjoint algorithms across an actor population. The compute servers normalize data using a common taxonomy, distribute normalized data across at least one but typically many knowledge nodes, supervise the execution of algorithms across knowledge nodes, and collate and present results of all analyses. Normalization performed by the compute servers is to provide data according to a common collection of fields. The collection of fields may be driven by data or driven by the information sought, i.e. established a priori.

Multiple knowledge nodes may be provided, and in typical cases actors are distributed as evenly as possible between the knowledge nodes. Knowledge nodes are set up as a web, with an ability to obtain and/or share data between knowledge nodes. With respect to the common collection of fields, these are represented by a taxonomy, representing a common language understood by all knowledge nodes such that data can be shared between nodes. And again, a supervisory node manages the knowledge nodes.

More formally, taxonomy is the science of naming, describing and classifying objects. In this instance, an object represents information (either an event, a state change or other piece of knowledge). The present system employs a particular taxonomy as follows:

Actor|Action|Process|Field|Value|Reference Key|Source Asset|Destination Asset

An individual entry (a collection of one value for each field in the taxonomy) is called a row. Most of these fields are flexible in terms of what they represent. In general, an Actor performs an Action via a Process whose results are located or placed in a Field containing a Value. The Action performed by the Actor may come from a Source Asset and may be applied to a Destination Asset. The Reference Key contains a value that ties a present Row to other Rows, thereby providing a method for combining the results of separate Row analysis if necessary.
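For illustration only, a Row in this taxonomy may be represented as a simple record. The class name `Row`, the helper `parse_row`, the pipe-delimited input form, and the sample values are assumptions of this sketch and are not part of the original disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    """One value for each field in the taxonomy."""
    actor: str
    action: str
    process: str
    field: str
    value: str
    reference_key: str
    source_asset: Optional[str] = None       # the Action may come from here
    destination_asset: Optional[str] = None  # the Action may be applied here


def parse_row(line: str) -> Row:
    """Build a Row from a pipe-delimited line in taxonomy field order."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != 8:
        raise ValueError(f"expected 8 taxonomy fields, got {len(parts)}")
    *required, source, destination = parts
    # Empty asset fields are stored as None because they are optional.
    return Row(*required, source or None, destination or None)


# Hypothetical event: an Actor's failed login from a laptop to a server.
row = parse_row("jsmith|login|ssh|status|failed|session-42|laptop-7|server-3")
print(row.actor, row.action, row.value, row.destination_asset)
```

Rows that share the same `reference_key` could then be grouped to combine the results of separate Row analyses.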

The Actor is the resource being analyzed. A Data Dictionary is the mapping between information, usually events, and at least one taxonomy. A Cognitive Model is a collection of Data Dictionaries. A Knowledge Node is a combination of a data store and a compute resource for running analyses. Knowledge Nodes are connected in a mesh fashion, i.e. each Knowledge Node can connect to every other Knowledge Node to extract data and execute commands. This mesh is known as a Knowledge Web.

The Knowledge Web is the collection of Knowledge Nodes connected in a web fashion, where the Supervisor or Supervisor Node controls the Knowledge Web. The Supervisor provides instructions on what to do to each Knowledge Node, when to do a task or action, and where to return the results from the task or action(s). The Supervisor collates the results of any actions in a Knowledge Web and stores and/or displays these results as desired or necessary.

Order of execution is as follows: Normalization, Distribution, Supervision, and/or Presentation. With respect to Normalization, the system executes a command to extract/acquire information from a data store as specified in the Data Dictionary entries in a Cognitive Model over some time period. The system then maps this information to the corresponding fields in the Taxonomy.

Distribution comprises randomly distributing the taxonomies of information to one of the available Knowledge Nodes, which then stores the Row. The Supervisor node and/or at least one Knowledge Node knows the location of the Row, i.e. Row 10489330 is on Knowledge Node Omicron. Supervision entails the Supervisor node coordinating analyses, either in a scheduled or ad-hoc fashion. Presentation entails the Supervisor node collating results so the results may be presented in a manner appropriate for a specific user.
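As a minimal sketch under stated assumptions, the Distribution step and the Supervisor's knowledge of Row locations may be modeled as follows. The class name `Supervisor`, its methods, the in-memory dictionaries standing in for Knowledge Node data stores, and the node names are all hypothetical.

```python
import random


class Supervisor:
    """Tracks which Knowledge Node stores each Row (illustrative only)."""

    def __init__(self, node_names, seed=None):
        # Each Knowledge Node is modeled as an in-memory dict of rows.
        self.nodes = {name: {} for name in node_names}
        self.locations = {}  # row id -> name of the node holding the row
        self._rng = random.Random(seed)

    def distribute(self, row_id, row):
        """Store the row on a randomly chosen node and record where it went."""
        node = self._rng.choice(list(self.nodes))
        self.nodes[node][row_id] = row
        self.locations[row_id] = node
        return node

    def fetch(self, row_id):
        """Retrieve a row by first looking up its node."""
        return self.nodes[self.locations[row_id]][row_id]


supervisor = Supervisor(["Alpha", "Omicron", "Sigma"], seed=7)
for row_id in range(10489330, 10489336):
    supervisor.distribute(row_id, {"actor": "jsmith", "row": row_id})

print(supervisor.locations[10489330])      # name of the node holding the row
print(supervisor.fetch(10489330)["row"])   # prints 10489330
```

A production arrangement would replace the dictionaries with real data stores and network calls; the sketch only shows the location bookkeeping.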

An example of this arrangement is a so-called “supermarket-to-table” example, where a seemingly infinite amount of food is available in a seemingly infinite number of supermarkets and able to be assembled using a seemingly infinite number of recipes for a seemingly infinite number of people, where the goal of the system is to provide assembled meals to a number of persons, perhaps in the thousands. With shipping available and economically feasible in certain situations, the number of possible supermarket sources or resources is exceedingly large and apparently infinite. Also, with variations available to any combination of ingredients, the number of recipes is also seemingly infinite. In this example, a knowledge node may be geographically oriented, i.e. addressing a certain number of diners in a region, wherein the supervisory node task is to provide an assembled meal at a meal time for a group of people, in this example within the geographic region. A data store may include significant information about the diners in question and the supermarkets and recipes available, and such information may be normalized. Information may include diner preferences, allergies, persons dining together, location of dining, and other information. As an example, persons may have provided this information to the data store, such as using the internet to answer a questionnaire. Fields are mapped to the taxonomy, which may include Actor|Action|Process|Field|Value|Reference Key|Source Asset|Destination Asset, here the Actor being the diner or the supermarket, Action being, for example, cost of delivery of an item, Process the process for actuating the item (order/payment/retrieval by diner), Field, Value, and Reference Keys representing administrative values for the particular row, and Source Asset and Destination Asset being the source product (tomatoes) and the destination asset (fully assembled dinner from a desired recipe at the Actor's chosen location on Sunday July 13). Again, these are examples and other values or fields may be provided, and taxonomies may differ.

Following Normalization, Distribution comprises distributing the taxonomies of information to one of the available Knowledge Nodes, which then stores the Row including the taxonomy of the tomatoes in the example outlined above. The Supervisor node and/or at least one Knowledge Node knows the location of the Row. Supervision entails the Supervisor node coordinating analyses, either in a scheduled or ad-hoc fashion, such as analyzing the best food selection and the least expensive meal for the given diner given his or her criteria. Presentation entails the Supervisor node collating results so the results may be presented in a manner appropriate for a specific user, such as an email to the supermarket and the diner, contemplating payment and collection by the user of the ingredients from the particular supermarket on a given day.

An alternate version of this “supermarket” situation in accordance with the present design centers around the idea that a supermarket may provide a perceived infinite amount and combination of ingredients. The problem is how to provide meals given such an available supply. The issue may be more broadly stated as, given a perceived infinite amount of food from a perceived infinite number of suppliers, how can meals be provided? To solve such a problem, the present system undertakes Normalization, representing a standard form for all recipes, Analysis comprising analyzing all provisions from all suppliers to select the “best” available ingredients for each recipe, and multiple analyses of ingredients (price, delivery time, quality, and so forth) to come up with a perceived optimal combination based on circumstances. The present system also includes Migration, delivering ingredients to the proper location(s) so the meal can be created, and Supervision, supervising preparation of the meal in an appealing fashion.

Relevancy Engine

The concept of Relevancy in this context couples Conceptual and Semantic (fuzzy) Rules with a bottom-up (as well as top-down), weighted dependency graph to isolate, prioritize, and quantify anomalous states in a wide spectrum of business and technical lines of business. The idea underlying Relevancy is to combine general and domain-specific knowledge specified by subject matter experts (in the form of rules) as well as rules discovered from data mining and real-time (or near real-time) event analysis. These rules absorb the current state of the “world” (the state of the model) and, executing together, identify a wide spectrum of anomalous (and often time-varying) behaviors. The combined degree of anomaly generated by the rules affects the weight on the dependency graph.

FIG. 7 provides a graphical overview of how the rules and the weighted relevancy graph are connected in the present system. Elements 701-706 represent elements of the rule based knowledge backbone, while elements 751-756 define the flow of processing and control through the associated graph.

The two major functional components of the Relevancy Engine are the knowledge base system that houses the set of semantic rules and the relevancy graph itself that propagates the results of the rule execution so that a weighted effect assessment can be made at the top levels of the graph. These two functional components work more or less independently so that the relevancy knowledge can change without affecting the mechanics of the graph. In this way, as an example, the system can create and explore multiple relevancy policies (rule packages) with, in certain instances, different rules from conflicting or competing subject matter experts.

The Rule (Knowledge) Component includes elements 701 through 706 in FIG. 7. Rule Package 701 is a structured assembly of rules. Rule Package 701 allows a collection of rules to be organized into a logically or thematically organized container. Although not required, by default the name of the rule package may also be the name of the relevancy graph.

Rule Set 702 is a collection of fuzzy rules. A Rule Set is associated with a node in the relevancy graph. The system executes rules in the Rule Set as a group. The system normally executes the rules in order, but they can also be executed according to the logic of an inference engine, which establishes an ordering agenda based on the continuing state of the system. An agenda acts as a probabilistic selection mechanism in the same way as a Markov model acts as a probabilistic path selector in a graph.

Event Data 703 represents data from or for an event. The system initiates a Rule Set in response to the arrival of data. This data is usually (but not necessarily) in the form of events. Events are described in a corresponding data model. That is, for event A, the data model defines the type of event, its organization, and the characteristic of each data element (field) in the event. Relevancy analysis can use events as standalone objects as well as the individual fields contained in an event. For example, the count of events during a particular time, the inter-arrival time of events, or the clustering of events at a particular time could be important pieces of knowledge, independent of the values of the individual fields.

Rules are shown as rules 704. A basic rule is in the form “when state.” The state in this instance is a fuzzy semantic proposition in the general form “data element name is [not] [hedges] semantic name.” As an example, consider the following simple rules:

when network_traffic is high

when network_traffic is very high

when network_traffic is not very high

when network_traffic is not somewhat very high

Rules can also contain structural discriminators that allow the knowledge engineer to fashion rules that deal with information instead of data. As an example:

when emails_sent count(hour) by actor is low

Using this instruction, the system can examine the volume of emails sent by an Actor for each hour in the Actor's statistically determined work period to determine if the “emails sent” volume is low. The system can also examine the intrinsic semantic nature of the data or information values over the current data stream. For example:

when network_traffic is often very high

This rule measures the degree to which a percentage of the network traffic values are considered very high. The system can base rules on a population of values that share a common semantic classification.

A relevancy rule generates a Compatibility Index (CIX), discussed above; this value is generated by an evaluation of the rule's premise state. This CIX measures the degree to which the premise is true (and is a real number in the range [0,1] inclusive). The system interprets this degree of truth as how compatible the state is with the premise. The cumulative average compatibility index from all the executed rules becomes the effect contribution (EC) of the rule set. “Cumulative” means that the rules are executed as though they were connected by an intersection (and) operator. The system may store this EC value in the relevancy graph node associated with the rule set.
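By way of a hedged illustration, the evaluation of simple rules such as “when network_traffic is not very high” may be sketched as follows. The linear membership function, its 100 and 1000 breakpoints, and the use of the conventional fuzzy-set hedges (squaring for “very”, square root for “somewhat”, complement for “not”) are assumptions of this sketch; the disclosure does not specify these definitions. The names `membership_high`, `cix`, and `effect_contribution` are hypothetical.

```python
import math


def membership_high(value, floor=100.0, ceiling=1000.0):
    """Degree to which a value is HIGH: 0 at the floor, 1 at the ceiling."""
    if value <= floor:
        return 0.0
    if value >= ceiling:
        return 1.0
    return (value - floor) / (ceiling - floor)


# Conventional hedge operators (assumed, not taken from the disclosure).
HEDGES = {"very": lambda m: m ** 2, "somewhat": math.sqrt}


def cix(value, hedges=(), negate=False):
    """Compatibility index of 'value is [not] [hedges] high'."""
    degree = membership_high(value)
    for hedge in reversed(hedges):  # the hedge nearest the term applies first
        degree = HEDGES[hedge](degree)
    return 1.0 - degree if negate else degree


def effect_contribution(cix_values):
    """Average CIX of the executed rules in a rule set."""
    return sum(cix_values) / len(cix_values)


traffic = 550.0  # hypothetical network_traffic reading
results = [
    cix(traffic),                                # is high          -> 0.5
    cix(traffic, hedges=("very",)),              # is very high     -> 0.25
    cix(traffic, hedges=("very",), negate=True)  # is not very high -> 0.75
]
print(results, effect_contribution(results))
```

Each value lies in [0,1], so the averaged effect contribution also lies in [0,1].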

Additional rule or graph capabilities associated with the relevancy engine may be provided. The Cognitive Modeler platform may also employ various rules or instructions. The Cognitive Modeler has extensive rule, graphical, and fuzzy semantic capabilities forming a foundation for aspects of the relevancy engine.

Rule-to-Node Connector 705 connects rules to nodes. As previously indicated, each rule set is associated with a node in the relevancy graph. Once this connection is completed, the system initiates relevancy engine processing by looping through the Rule Package 701.

Nodes in the relevancy graph are of two types: processor and aggregator. A processor node has an associated rule set in the rule package. An aggregator node is a parent to processor nodes as well as other aggregator nodes. The aggregator “absorbs” effect contributions from each of its children.

The purpose of the effect analysis graph is simply to transport weighted effect values from the nodes associated with rules (processors), up through collections of evaluation (aggregator) nodes, weighting each aggregate effect value by a weight assigned to the edge. At the top level of the graph is a weighted effect for the semantic nature of the top node sets.

Processor 751, or Relevancy Processor Node 751, has an effect assessment contribution set by the collection of rules in the associated rule set. The assessment contribution set is the average of the compatibility index values (known as effect contributions) for each of the rules.

Relevancy Edge 752 connects a node further down on the graph with a node further up on the graph. The edge contains an assigned weight used to compute the weighted average of the effects from all the node's children. Aggregator Node 753 absorbs all the effect values (e) from its children nodes. The weights of the incoming edges that connect each child node to its aggregator node are used to compute the weighted value for the aggregator node. The value is:

$c^{e} = \frac{\sum\limits_{i = 1}^{N}\; {w_{i} \times e_{i}}}{\sum\limits_{i = 1}^{N}\; w_{i}}$

where:

-   -   c^(e) the weighted combined effect
    -   N the number of incoming edges
    -   w_(i) the weight on the i^(th) incoming edge
    -   e_(i) the effect from the i^(th) incoming node

This weighting is applied at each aggregator node to generate a weighted effect contribution. Weights can be any value in the range [0,100]. Since the effects are CIX values, the weighted value will fall in the same range as CIX values [0,1]. From FIG. 7, the numbers in the corners of the processor or aggregator boxes, such as 0.30 in the delta processor box, represent estimated values in that processor for a particular item, such as a particular Actor or particular action. These values may be combined using further weightings and made available and/or processed by the aggregator node.

Point 754 represents the Weighted Effect Value, where once a weighted value is calculated for a set of children nodes, the parent node passes this weighted value up to its own parent. Point 755 is the Top Aggregator Node. The movement up the graph continues from processors to aggregators until the upward flow reaches a node with no parent. This is the relevancy state for the entire graph.
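As a non-limiting sketch, the upward propagation of weighted effect values may be expressed recursively. The class name `Node`, the function `weighted_effect`, the node names other than the delta processor, and all effect values and edge weights other than the 0.30 noted for the delta processor are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A relevancy graph node.

    A processor has no children and carries the effect contribution of its
    rule set. An aggregator holds (edge weight, child node) pairs.
    """
    name: str
    effect: float = 0.0
    children: list = field(default_factory=list)


def weighted_effect(node):
    """Return the weighted combined effect passed up from this node."""
    if not node.children:
        return node.effect  # processor node: effect comes from its rules
    total_weight = sum(weight for weight, _ in node.children)
    if total_weight == 0:
        raise ValueError(f"aggregator {node.name!r} has zero total edge weight")
    combined = sum(weight * weighted_effect(child) for weight, child in node.children)
    return combined / total_weight


delta = Node("delta", effect=0.30)
gamma = Node("gamma", effect=0.70)
beta = Node("beta", effect=0.90)
middle = Node("middle", children=[(50, delta), (50, gamma)])
top = Node("top", children=[(25, middle), (75, beta)])

# middle = (50*0.30 + 50*0.70) / 100 = 0.50
# top    = (25*0.50 + 75*0.90) / 100 = 0.80
print(round(weighted_effect(top), 2))
```

Because every effect is a CIX value in [0,1] and the weights are non-negative, the value reaching the top aggregator also stays in [0,1].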

Results Signals 756 are the result of the relevancy engine, representing the weighted effect value from the top node (or possibly the top set of nodes). The result is generated in the form of one or more cognitive signals.

Modeling the Statistical Behavior of Statistical Objects

The present design also includes modeling the statistical behavior of statistical objects. The system discovers, models, and predicts the behavior of a resource by comparing its behavior to one or more collections of statistical resources. A statistical resource (called a Behavior Cluster) captures the behaviors of a large number of actors that behave in roughly the same way over roughly the same time periods. The system may create Behavior Clusters through machine learning processes, e.g. fuzzy clustering and/or probabilistic graph theory. Behavior Clusters are used to predict the probable behavior of a new actor through the application of advanced machine intelligence capabilities (fuzzy set theory, fuzzy similarity mapping, nonlinear regression, as well as linear and nonlinear correlation).

For the modeling function, the following definitions are applicable. Resources, often called actors or (less commonly) affecters, include, as an example, such things as people, computing technology machines and devices, networks, autonomous and connected machines, vehicles, aircraft, containers, railroads, ships, power plants, homes and businesses, drones, satellites, robots, space craft, and meters. Actors can also include less physical entities such as a wide range of electronic documents, independent and connected computer systems, web sites, autonomous intelligent agents, as well as unknown buyers and web site visitors in areas such as ecommerce.

The Modeling Ecology is the environment in which the behavior occurs. As an example, assume an Actor is Bill Smith. Bill Smith is a subscribed member of Mega Gadget's online shopping store. When he visits the Mega Gadget site, the Behavior Clusters to be accessed reside in the Mega Gadget website ecology. When Bill Smith is roaming around in his smart home, controlling the lights, thermostats, and so forth, the Behavior Cluster to be accessed resides in his resident smart home ecology. The house itself is a member of similar houses in a wide autonomous smart home ecology. When Bill Smith is in his driverless (autonomous) car, the Behavior Clusters to be accessed reside in his passenger smart car ecology. The vehicle itself may reside in a wider autonomous smart car ecology.

The Direction(s) of Actions Graphs are very similar to order of operations graphs used to model the time-ordered actions of a resource in the Extreme Vigilance Behavior application. Both are dynamically discovered Markov graphs. Where the order of operations graph develops a probabilistic model of a single actor's actions by day of week and time of day, the Direction of Actions Graph develops a probabilistic model of the actions of a very large number of actors (by time periods). Thus the edge probabilities reflect the probability that an Actor (either known or unknown) will transition from node X to node Y (as an example, from an HTML page in site R to another HTML page in site R). The system uses the evolution of paths in a Direction of Actions graph to identify the behavior of an unknown Actor.
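As an illustration only, the edge probabilities of such a graph might be estimated by counting observed transitions across the paths of many actors. The following Python sketch makes several assumptions that are not part of the present description: the function name build_direction_of_actions_graph, the page names, and the simplification that all paths fall within a single time period.

```python
from collections import defaultdict


def build_direction_of_actions_graph(paths):
    """Estimate P(next node | current node) from many actors' paths.

    Hypothetical helper: counts each observed transition and normalizes
    the counts leaving a node so they sum to 1.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for path in paths:
        # Pair each node with its successor along the path.
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1

    graph = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        graph[src] = {dst: n / total for dst, n in dsts.items()}
    return graph


# Illustrative page sequences for four anonymous visitors (assumed data).
paths = [
    ["home", "search", "product", "checkout"],
    ["home", "product", "checkout"],
    ["home", "search", "product"],
    ["home", "search", "home"],
]
graph = build_direction_of_actions_graph(paths)
```

In this sketch, three of the four transitions leaving "home" go to "search", so that edge receives a probability of 0.75. A fuller implementation would presumably keep one such graph per time period.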

Behavior Clusters define the statistical actions of a very large number of individual Actors over a time period. A cluster is multi-dimensional. That is, a cluster can discover and represent a behavior such as (in ecommerce) page dwell-time by items selected by checkout value (in dollars). A single behavior can be shared by multiple clusters (one cluster's dwell-time can be SHORT, another cluster's dwell-time can be MODERATE, and another cluster's dwell-time can be LONG, for example). Because clusters use qualitative semantics (fuzzy logic) to encode their behaviors, clusters do not have unique behaviors. That is, some range of SHORT values will overlap some range of MODERATE values (but to different complementary degrees). Hence a resource can belong to (be mapped to) multiple Behavior Clusters. The system can then model the real-world situation where actor behaviors are imprecise, roughly repeated, and sometimes ambiguous. This nonexclusive property also has implications for the precision and accuracy of the real-world modeling of a wide spectrum of Actors.
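The overlap between SHORT, MODERATE, and LONG can be illustrated with a small Python sketch using trapezoidal fuzzy sets. The breakpoints (in seconds), the trapezoid shape, and the names trapezoid, DWELL_TIME_SETS, and memberships are assumptions chosen for illustration; the present description does not specify them.

```python
def trapezoid(x, a, b, c, d):
    """Degree of membership: rises on (a, b), is 1 on [b, c], falls on (c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


# Assumed dwell-time fuzzy sets, in seconds. Adjacent sets overlap.
DWELL_TIME_SETS = {
    "SHORT": (0, 0, 20, 40),
    "MODERATE": (20, 40, 60, 90),
    "LONG": (60, 90, 600, 600),
}


def memberships(dwell_seconds):
    """Return every fuzzy set the value belongs to, with its degree."""
    degrees = {
        label: trapezoid(dwell_seconds, *shape)
        for label, shape in DWELL_TIME_SETS.items()
    }
    return {label: degree for label, degree in degrees.items() if degree > 0.0}


print(memberships(25))  # a 25-second dwell is partly SHORT and partly MODERATE
```

With these assumed breakpoints, a 25-second dwell time belongs to SHORT to degree 0.75 and to MODERATE to degree 0.25, which is the complementary overlap that lets one resource map to more than one Behavior Cluster.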

With respect to specific modeling and prediction in the present design, behavior analysis calls for discovering, quantifying, and modeling the behavior of a collection of individual Actors that share, to some degree, a common set of features. These features may be considered the semantics of the behavior. Rather than following the behavior of an individual Actor (a person, a server, a machine, and so forth), the system considers a cluster of Actors that share the same behaviors as the Actor. Hence (as a few examples) in e-commerce, swarms of machines (such as smart energy meters), advertising campaigns, tactical and strategic troop and weapons deployments, and so forth, individual Actors are replaced by their surrogate, the cluster of buyers, meters, television viewers, and combat infantry battalions that share a common behavior over a common set of features or properties. We call this the statistical modeling of statistical objects.

Assigning an individual Actor to one or more Behavior Clusters allows the system to predict the behavior of an Actor relative to its peers (or identify and quantify unusual behaviors for both known as well as unknown Actors).

For known Actors (such as returning shoppers to Mega Gadget) we can find their behaviors in the shopping ecology through their sign-on identity.

For unknown (anonymous) actors, the system needs to map their path through the web site while continuously matching this unfolding path to the statistical paths (in the Direction of Actions graphs) stored in any of the ecologies. Unknown Actors that do not complete a well-defined path create a partial path which is also stored in the ecology (partial paths are important because they often isolate roadblocks, as an example, to effective shopping or conversion).
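One plausible way to match an unfolding path against stored statistical paths is to score the partial path under each ecology's graph and keep the best-scoring ecology. The Python sketch below is an assumption made for illustration, not the matching method of the present design: it uses a summed log-probability score, a small floor probability for transitions never observed, and hypothetical ecology names and edge values.

```python
import math


def path_log_likelihood(graph, path, floor=1e-6):
    """Sum the log edge probabilities along a path.

    Transitions missing from the graph receive the assumed `floor`
    probability so that a single unseen step does not yield log(0).
    """
    total = 0.0
    for src, dst in zip(path, path[1:]):
        total += math.log(graph.get(src, {}).get(dst, floor))
    return total


def best_matching_ecology(ecologies, partial_path):
    """Return the name of the ecology whose graph best explains the path."""
    return max(
        ecologies,
        key=lambda name: path_log_likelihood(ecologies[name], partial_path),
    )


# Hypothetical Direction of Actions graphs for two ecologies.
ecologies = {
    "shopping": {
        "home": {"search": 0.7, "product": 0.3},
        "search": {"product": 0.8, "home": 0.2},
    },
    "support": {
        "home": {"help": 0.9, "search": 0.1},
        "help": {"contact": 1.0},
    },
}

# The path is rescored each time the anonymous actor takes another step.
partial_path = ["home", "search", "product"]
match = best_matching_ecology(ecologies, partial_path)
```

Because the score is recomputed as the path grows, the same function can be applied to partial paths that never reach a well-defined end.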

Thus in general, the present design may be considered as follows. In a first aspect, the design includes a system for measuring values qualitatively across multiple dimensions (terrain) using cognitive computing techniques. Event reception components, for each field element in each event in a stream, comprise or perform the following functions or functional components: threshold application, terrain updater, outlier analysis module, threshold violation predictor, time-ordered behavior evaluator, and graph updater.

In a second aspect, the design includes a system for detecting and adjusting qualitative contexts across multiple dimensions for multiple actors with cognitive computing techniques. Periodic execution components operate over full or partial sets of received data, including a peer to peer analyzer, an actor correlation analyzer, an actor behavior analyzer, a rate of change predictor, a semantic rule analyzer, and a plurality of signal managers.

In a third aspect, there is provided a system for “defuzzification” of multiple qualitative signals into human-centric threat notifications using cognitive computing techniques. The design may include signal generation components, a system which evolves understanding of anomalies and risks into one or more human-facing threat indications. The system may be based on the described components that detect and signal anomalies. The system generates an anomaly signal by detecting behavior that is inconsistent with normal behavior. An anomaly has two important properties: its severity and its degree of inconsistency (or mathematical distance) from normal behavior. The severity is a class of anomaly (warning, caution, alert, and severe) assigned by the analytical component that detected and measured the anomaly. Alternatively or additionally, the system may be based on the components described herein that detect and signal hazards. A hazard is an unperfected threat associated with an actor without regard to any related assets or the behavior of other actors. It represents the risk to an enterprise based solely on cumulative multi-dimensional behaviors (that is, anomalous states generated from thresholds, orders of operation (AOO), peer-to-peer similarity, and the actor's behavior change over time). A Hazard may have two important properties: severity and weighted risk. A hazard can have a severity of medium, elevated, or high. The severity is not assigned by the system, but is derived by the system from the collection of inherent terrain risks.
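The two anomaly properties, severity and distance from normal behavior, can be pictured with a short Python sketch. Everything numeric here is an assumption for illustration: the distance is taken to be the number of standard deviations from the historical mean, and the cutoffs that map a distance to warning, caution, alert, or severe are invented, since the present description does not state how an analytical component assigns the class.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

# Assumed cutoffs, checked from most to least severe.
SEVERITY_CUTOFFS = [
    (4.0, "severe"),
    (3.0, "alert"),
    (2.5, "caution"),
    (2.0, "warning"),
]


@dataclass
class Anomaly:
    severity: str    # class of anomaly
    distance: float  # degree of inconsistency from normal behavior


def detect_anomaly(history, value):
    """Return an Anomaly if `value` is far from the history, else None."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return None  # no spread in the history, so no distance scale
    distance = abs(value - mu) / sigma
    for cutoff, label in SEVERITY_CUTOFFS:
        if distance >= cutoff:
            return Anomaly(severity=label, distance=distance)
    return None


# Hypothetical daily counts of failed logins for one actor.
history = [10, 12, 11, 9, 10, 8]
signal = detect_anomaly(history, 14)
```

Under these assumptions a value of 14 lies roughly 3.1 standard deviations from the mean of 10 and is classed as an alert, while a value of 11 produces no anomaly signal.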

Additionally or alternately, the system may include the components described herein and may detect and signal threats. A threat may be a “perfected” threat that ties together actors with the behaviors of other actors as well as the assets used by all the actors in the hazard collection. Threats may develop a sequence of operations over a dynamic time-frame. Threats may be connected in a heterogeneous graph where nodes can be actors or assets and the edges define the frame as well as the strength of their connection as a function of risk. The system may produce incidents that may be reviewed by reviewers.

Additionally or alternately, the system may dispatch cognitive computing across multiple workers. Load distribution components may include a supervisory node dispatching work to a plurality of interconnected knowledge nodes called compute servers. The supervisory node may categorize and scale performance of multiple disjoint algorithms across a seemingly infinite actor population, normalize data using a common taxonomy, distribute normalized data relatively evenly across the plurality of knowledge nodes, supervise algorithm execution across knowledge nodes, and collate and present results of analysis of the seemingly infinite actor population.
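A minimal Python sketch of the normalize-then-distribute step follows. It assumes a toy taxonomy that renames raw field names to common ones, and it assigns each actor to a knowledge node by a stable hash so that all of an actor's records land on the same node. The taxonomy contents, the hashing scheme, and the names normalize, node_for, and dispatch are illustrative assumptions rather than the distribution method of the present design.

```python
import hashlib

# Assumed taxonomy: maps source-specific field names to common names.
TAXONOMY = {"user": "actor", "uid": "actor", "host": "asset", "server": "asset"}


def normalize(record):
    """Rename fields to the common taxonomy; unknown fields pass through."""
    return {TAXONOMY.get(key, key): value for key, value in record.items()}


def node_for(actor, node_count):
    """Pick a knowledge node index from a stable hash of the actor name."""
    digest = hashlib.sha256(actor.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def dispatch(records, node_count):
    """Supervisory step: normalize each record and place it on a node."""
    nodes = [[] for _ in range(node_count)]
    for record in records:
        normalized = normalize(record)
        nodes[node_for(normalized["actor"], node_count)].append(normalized)
    return nodes


# Hypothetical raw events from two differently shaped sources.
records = [
    {"user": "alice", "host": "db1", "bytes": 120},
    {"uid": "bob", "server": "web3", "bytes": 300},
    {"user": "alice", "host": "web3", "bytes": 45},
]
nodes = dispatch(records, node_count=4)
```

Hashing spreads a large actor population roughly evenly across nodes without any shared state, at the cost of exact balance; a supervisory node that tracks load could instead assign actors explicitly.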

At least one knowledge node may comprise a relevancy engine. Construction of such a relevancy engine may include a rule package and a series of processors organized in a tree arrangement configured to perform functions according to the rule set wherein the topmost processor in the tree structure provides results of analysis using fuzzy logic and the rule package organizes a series of rule sets, each rule set corresponding to a different one of the series of processors.
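The tree arrangement can be sketched in Python as follows. The sketch assumes that each rule is a function returning a fuzzy degree between 0 and 1, that a processor combines its own rules and its children's results with a fuzzy OR (maximum), and that the rule package is a simple mapping from processor name to rule set. The class name Processor, the two example rules, and the choice of maximum as the combining operator are all hypothetical.

```python
class Processor:
    """One node in the processor tree, driven by its own rule set."""

    def __init__(self, rule_set, children=()):
        self.rule_set = list(rule_set)
        self.children = list(children)

    def evaluate(self, event):
        # Degrees from this processor's rules and from its subtree.
        own = [rule(event) for rule in self.rule_set]
        below = [child.evaluate(event) for child in self.children]
        # Fuzzy OR (assumed): the strongest degree of relevance wins.
        return max(own + below, default=0.0)


def failed_login_rule(event):
    """Degree to which the failed-login count looks high (assumed scale)."""
    return min(1.0, event.get("failed_logins", 0) / 10)


def data_volume_rule(event):
    """Degree to which outbound volume looks high (assumed scale)."""
    return min(1.0, event.get("bytes_out", 0) / 1000)


# Rule package: one rule set per processor in the tree.
rule_package = {
    "root": [],
    "login": [failed_login_rule],
    "data": [data_volume_rule],
}

root = Processor(
    rule_package["root"],
    children=[Processor(rule_package["login"]), Processor(rule_package["data"])],
)
relevance = root.evaluate({"failed_logins": 8, "bytes_out": 100})
```

In this arrangement only the topmost processor reports a result, here the largest degree found anywhere in its subtree.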

Thus according to one aspect of the present design, there is provided a cognitive system process executable on a computing device, comprising receiving a set of actors and associated actor information, receiving assets and their associated asset information, creating data dictionary entries for at least one taxonomy based on the set of actors and the assets, creating at least one cognitive model using the data dictionary entries for a time period, computing trust of the cognitive model as a fuzzy number, activating the cognitive model if trust of the cognitive model is above a cognitive model trust threshold, when the cognitive model is activated, scheduling a collection of tasks to run that perform regular extraction of actions from an original data source and performing at least one anomaly analysis associated with the cognitive model, for selected data dictionary entries, normalizing associated actor actions by converting at least one event to data dictionary format, inserting at least one normalized terrain entry into the cognitive model, and updating the cognitive model.
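The trust test in this process can be pictured with a brief Python sketch. It assumes the fuzzy number is triangular (a lowest plausible value, a most likely value, and a highest plausible value) and that it is compared with the threshold through its centroid. Both choices, together with the names TriangularFuzzyNumber and should_activate and the example values, are assumptions; the present description does not say how trust is computed or compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    low: float   # lowest plausible trust
    peak: float  # most likely trust
    high: float  # highest plausible trust

    def centroid(self):
        """Crisp summary of a triangular fuzzy number (its centroid)."""
        return (self.low + self.peak + self.high) / 3


def should_activate(trust, trust_threshold):
    """Activate the model only if its trust exceeds the threshold."""
    return trust.centroid() > trust_threshold


# Hypothetical trust values for two candidate cognitive models.
well_supported = TriangularFuzzyNumber(low=0.6, peak=0.8, high=0.9)
sparse_data = TriangularFuzzyNumber(low=0.2, peak=0.5, high=0.6)

print(should_activate(well_supported, trust_threshold=0.7))
print(should_activate(sparse_data, trust_threshold=0.7))
```

Only a model that passes this check would go on to have its extraction and anomaly analysis tasks scheduled.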

According to a further aspect of the present design, there is provided a system for performing cognitive modeling comprising a plurality of components repeated for each field in an event configured to receive data elements, the plurality of components comprising a threshold application component, a terrain updater, an outlier analysis module, a threshold violation predictor, a time-ordered behavior evaluator, and a graph updater, a periodic set of components configured to operate periodically, comprising a peer to peer analyzer, an actor correlation analyzer, an actor behavior analyzer, a rate of change predictor, and a semantic rule analyzer, and a plurality of signal managers. The plurality of components and the periodic set of components are configured to interface with a threat detector.

According to another aspect of the present design, there is provided a series of interconnected compute servers comprising a supervisory node and a plurality of knowledge nodes, wherein the series of compute servers are configured to categorize and scale performance of multiple disjoint algorithms across a seemingly infinite actor population, wherein the series of interconnected compute servers are configured to normalize data using a common taxonomy, distribute normalized data relatively evenly across the plurality of knowledge nodes, supervise algorithm execution across knowledge nodes, and collate and present results of analysis of the seemingly infinite actor population. At least one knowledge node comprises a relevancy engine comprising a rule package and a series of processors organized in a tree arrangement and configured to perform functions according to the rule set; wherein the topmost processor in the tree structure provides results of analysis using fuzzy logic, wherein the rule package comprises a series of rule sets, each rule set corresponding to a different one of the series of processors.

According to a further aspect of the present design, there is provided a cognitive system comprising a receiver configured to receive a set of actors and associated actor information and receive assets and their associated asset information, a creation apparatus configured to create data dictionary entries for a taxonomy based on the set of actors and the assets and create a cognitive model using the data dictionary entries for a time period, and a computing apparatus configured to compute trust of the cognitive model as a fuzzy number and activate the cognitive model if trust of the cognitive model is above a cognitive model trust threshold. When the cognitive model is activated, the cognitive modeling system is configured to schedule a collection of tasks to run that perform regular extraction of actions from an original data source and perform at least one anomaly analysis associated with the cognitive model. For selected data dictionary entries, the cognitive modeling system is configured to normalize associated actor actions by converting at least one event to data dictionary format, insert at least one normalized terrain entry into the cognitive model, and update the cognitive model.

According to a further aspect of the present design, there is provided a cognitive modeling system comprising a receiver configured to receive a set of actors, associated actor information, assets, and associated asset information, a creation apparatus configured to create data dictionary entries for a taxonomy based on the set of actors and the assets and create a cognitive model using the data dictionary entries applicable to a time period, and a computing apparatus configured to compute trust of the cognitive model as a fuzzy number and activate the cognitive model if trust of the cognitive model is above a cognitive model trust threshold.

According to a further aspect of the present design, there is provided a cognitive modeling method comprising receiving at a hardware computing arrangement, comprising a number of processor nodes and a number of aggregator nodes, a set of actors, associated actor information, assets, and associated asset information, creating data dictionary entries for a taxonomy based on the set of actors and the assets, creating a cognitive model using the data dictionary entries for a time period, computing trust of the cognitive model as a fuzzy number, activating the cognitive model if trust of the cognitive model is above a cognitive model trust threshold, when the cognitive model is activated, scheduling a collection of tasks to run that perform regular extraction of actions from an original data source and perform at least one anomaly analysis associated with the cognitive model. For selected data dictionary entries, the method further comprises normalizing associated actor actions by converting at least one event to data dictionary format, inserting at least one normalized terrain entry into the cognitive model, and updating the cognitive model.

According to a further aspect of the present design, there is provided a system for performing cognitive modeling, comprising an event acquirer configured to acquire an event comprising an associated date and set of data fields, an analyzer element comprising a plurality of components repeated for each field in an event received from the event acquirer, wherein the analyzer element applies thresholds to each event, determines outliers, evaluates time-ordered behavior, and predicts threshold violations for the event, a periodic set of components configured to operate periodically on demand, the periodic set of components configured to perform peer to peer analysis, actor correlation analysis, actor behavior analysis, semantic rule analysis, and predict rates of change, and a plurality of signal managers interfacing with the analyzer element and the periodic set of components configured to exclude signals based on content properties of data transmitted. The plurality of components and the periodic set of components are configured to interface with a threat detector.

According to a further aspect of the present design, there is provided a system for performing cognitive modeling, comprising an event acquirer configured to acquire an event comprising an associated date and set of data fields, an analyzer element comprising a plurality of components repeated for each field in an event received from the event acquirer, a periodic set of components configured to operate periodically on demand to analyze and predict based on information received from the analyzer element, and a plurality of signal managers interfacing with the analyzer element and the periodic set of components, wherein the periodic set of components is configured to exclude signals based on content properties of data transmitted. The plurality of components and the periodic set of components are configured to interface with a threat detector.

According to a further aspect of the present design, there is provided a cognitive modeling apparatus, comprising an event acquirer, an updating and evaluating arrangement comprising hardware configured to apply thresholds, update event related data, predict thresholds and determine outliers from events received from the event acquirer, and a periodic/on demand apparatus configured to analyze event data on demand, and a series of signal managers comprising a first signal manager connected to the updating and evaluation arrangement and a second signal manager connected to the periodic/on demand apparatus. The series of signal managers are configured to exclude signals based on content properties.

According to a further aspect of the present design, there is provided a system comprising a series of interconnected compute servers comprising a supervisory hardware node and a plurality of knowledge hardware nodes, wherein the series of interconnected compute servers are configured to categorize and scale performance of multiple disjoint algorithms across a seemingly infinite actor population, wherein the series of interconnected compute servers are configured to normalize data using a common taxonomy, distribute normalized data relatively evenly across the plurality of knowledge hardware nodes, supervise algorithm execution across knowledge hardware nodes, and collate and present results of analysis of the seemingly infinite actor population.

According to a further aspect of the present design, there is provided a system comprising a knowledge base system comprising a rule package comprising a plurality of rule sets, and a relevancy processor arrangement comprising a series of processors organized in a tree arrangement and configured to perform functions according to at least one of the plurality of rule sets contained in the rule package, wherein a topmost processor in the tree structure provides results of analysis using fuzzy logic, wherein the relevancy processor arrangement comprises a series of interconnected compute servers comprising aggregator hardware nodes and processor hardware nodes, wherein the series of interconnected compute servers are configured to categorize and scale performance of multiple disjoint algorithms across a seemingly infinite actor population. The series of interconnected compute servers are configured to normalize data using a common taxonomy.

According to a further aspect of the present design, there is provided a system comprising a knowledge base hardware system comprising a rule package comprising a plurality of rule sets configured to receive rules from a rule device and events from an events device and a relevancy processor arrangement comprising a series of processors organized in a tree arrangement and configured to perform functions according to at least one of the plurality of rule sets contained in the rule package using fuzzy logic, wherein the relevancy processor arrangement comprises a series of interconnected compute servers comprising aggregator hardware nodes and processor hardware nodes. The series of interconnected compute servers are configured to categorize and scale performance of multiple disjoint algorithms across a seemingly infinite actor population.

According to a further aspect of the present design, there is provided a system for measuring values qualitatively across a terrain comprising multiple dimensions using cognitive computing techniques, comprising a plurality of event reception components configured to operate on each event in a stream relevant to the terrain, the plurality of event reception components comprising a threshold application component configured to apply a threshold to each element in the stream, a terrain updater configured to update the terrain based on at least one event, an outlier analysis module configured to determine any outlier in the stream of events, a threshold violation predictor configured to predict threshold violations based on the stream of events, a time-ordered behavior evaluator configured to evaluate behavior based on the stream of events, and a graph updater to update a graph based on the stream of events.

According to a further aspect of the present design, there is provided a system for measuring values qualitatively across a terrain comprising multiple dimensions using cognitive computing techniques, comprising a plurality of event reception components configured to operate on each event in a stream relevant to the terrain, a periodic set of components configured to operate periodically on demand to analyze and predict based on information received from the plurality of event reception components, and a plurality of signal managers interfacing with the plurality of event reception components and the periodic set of components, wherein the plurality of signal managers is configured to exclude signals based on content properties of data transmitted.

According to a further aspect of the present design, there is provided a method for measuring values qualitatively across a terrain comprising multiple dimensions using cognitive computing techniques, comprising operating on each event in a stream relevant to the terrain using a plurality of event reception components, operating on demand to analyze and predict based on information received from the plurality of event reception components using a periodic set of components, and excluding selected signals based on content properties of data transmitted.

According to a further aspect of the present design, there is provided a system for detecting and adjusting qualitative contexts across multiple dimensions for multiple actors with cognitive computing techniques comprising a series of periodic execution components configured to operate over full or partial sets of received data, the series of periodic components comprising a peer to peer analyzer configured to detect anomalous behaviors among work-specific peer actors sharing similar types of tasks, an actor behavior analyzer configured to examine change in an actor's behavior over time by comparing the similarity of past behavior and current behavior, a rate of change predictor configured to study changes in behavior over time for peer to peer performance according to the peer to peer analyzer, actor behavior change according to the actor behavior analyzer, and actor correlation analysis, and a semantic rule analyzer configured to encode conditional, provisional, cognitive, operational, and functional knowledge, and a plurality of signal managers configured to exclude signals based on content properties of data transmitted.

According to a further aspect of the present design, there is provided a system for detecting and adjusting qualitative contexts across multiple dimensions for multiple actors with cognitive computing techniques comprising a series of periodic execution components configured to operate over full or partial sets of received data, the series of periodic components comprising a peer to peer analyzer configured to detect anomalous behaviors among work-specific peer actors sharing similar types of tasks and an actor behavior analyzer configured to examine change in an actor's behavior over time by comparing the similarity of past behavior and current behavior, and a plurality of event reception components configured to operate on each event in a stream relevant to a terrain and selectively provide information to the series of periodic execution components.

According to a further aspect of the present design, there is provided a method for detecting and adjusting qualitative contexts across multiple dimensions for multiple actors with cognitive computing techniques comprising detecting anomalous behaviors among work-specific peer actors sharing similar types of tasks using a peer to peer analyzer, examining change in an actor's behavior over time by comparing the similarity of past behavior and current behavior using an actor behavior analyzer, studying changes in behavior over time for peer to peer performance according to the peer to peer analyzer, actor behavior change according to the actor behavior analyzer, and actor correlation analysis using a rate of change predictor; and encoding conditional, provisional, cognitive, operational, and functional knowledge using a semantic rule analyzer.

According to a further aspect of the present design, there is provided a system for defuzzification of multiple qualitative signals into human-centric threat notifications using cognitive computing techniques, comprising a series of periodic execution components configured to operate over full or partial sets of received data, and a plurality of event reception components configured to operate on each event in a stream relevant to a terrain and selectively provide information to the series of periodic execution components. The system detects and signals anomalies upon detecting behavior inconsistent with normal behavior.

According to a further aspect of the present design, there is provided a system comprising a series of periodic execution components configured to operate over full or partial sets of received data, a plurality of event reception components configured to operate on each event in a stream relevant to a terrain and selectively provide information to the series of periodic execution components, and a signal manager configured to manage signals for one of the series of periodic execution components and the plurality of event reception components. The system performs defuzzification of multiple qualitative signals into human-centric threat notifications using cognitive computing techniques by detecting and signaling anomalies upon detecting behavior inconsistent with normal behavior.

According to a further aspect of the present design, there is provided a system, comprising a series of periodic execution components configured to operate over full or partial sets of received data, a plurality of event reception components configured to operate on each event in a stream relevant to a terrain and selectively provide information to the series of periodic execution components, and a threat detector configured to detect threats based on information provided by the series of periodic execution components and the plurality of event reception components. The system performs defuzzification of multiple qualitative signals into human-centric threat notifications using cognitive computing techniques by detecting and signaling anomalies upon detecting behavior inconsistent with normal behavior.

According to a further aspect of the present design, there is provided a system for dispatching cognitive computing across multiple workers, comprising a supervisory node configured to dispatch work to a plurality of compute servers forming interconnected knowledge nodes and a relevancy engine provided in at least one knowledge node. The relevancy engine comprises a rule package and a series of processors organized in a tree arrangement, the series of processors configured to perform functions according to the rule package.

According to a further aspect of the present design, there is provided a system for dispatching cognitive computing across multiple workers, comprising a plurality of compute servers forming interconnected knowledge nodes, wherein one of the interconnected supervisory nodes comprises a supervisory node configured to dispatch work to at least one knowledge node. One knowledge node comprises a relevancy engine comprising a series of processors organized in a tree arrangement, the series of processors configured to perform functions according to a rule set.

According to a further aspect of the present design, there is provided a system for dispatching cognitive computing across multiple workers, comprising a plurality of compute servers configured to form a supervisory node configured to dispatch work and a plurality of interconnected knowledge nodes, with a relevancy engine provided in at least one interconnected knowledge node. The relevancy engine comprises a series of processors organized in a tree arrangement, the series of processors configured to perform functions according to a rule package.

In one or more exemplary designs, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another, i.e. may include transitory and/or non-transitory computer readable media. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

The foregoing description of specific embodiments reveals the general nature of the disclosure sufficiently that others can, by applying current knowledge, readily modify and/or adapt the system and method for various applications without departing from the general concept. Therefore, such adaptations and modifications are within the meaning and range of equivalents of the disclosed embodiments. The phraseology or terminology employed herein is for the purpose of description and not of limitation.

What is claimed is:
 1. A system for performing cognitive modeling,comprising: an event acquirer configured to acquire an event comprisingan associated date and set of data fields; an analyzer elementcomprising a plurality of components repeated for each field in an eventreceived from the event acquirer, wherein the analyzer element appliesthresholds to each event, determines outliers, evaluates time-orderedbehavior, and predicts threshold violations for the event; a periodicset of components configured to operate periodically on demand, theperiodic set of components configured to perform peer to peer analysis,actor correlation analysis, actor behavior analysis, semantic ruleanalysis, and predict rates of change; and a plurality of signalmanagers interfacing with the analyzer element and the periodic set ofcomponents configured to exclude signals based on content properties ofdata transmitted; wherein the plurality of components and the periodicset of components are configured to interface with a threat detector. 2.The system of claim 1, wherein the analyzer element comprises athreshold application component, a terrain updater, an outlier analysismodule, a threshold violation predictor, a time-ordered behaviorevaluator, and a graph updater.
 3. The system of claim 1, wherein theperiodic set of components comprises a peer to peer analyzer, an actorcorrelation analyzer, an actor behavior analyzer, a rate of changepredictor, and a semantic rule analyzer.
 4. The system of claim 1,wherein a first signal manager is provided with the analyzer element anda second signal manager is provided with the periodic set of components,and the first signal manger and the second signal manager interface witha signal filter configured to receive signal weights from a signalweight repository.
 5. The system of claim 1 wherein the threat detectoremploys an extreme vigilance application programming interface.
 6. Thesystem of claim 1, wherein the outlier analyzer determines whether ornot frequency, periodicity, and general value of outliers implies thestart of another data pattern or a change to an existing data pattern.7. The system of claim 2, wherein time-ordered behavior evaluatoremploys a Markov Graph, to learn a general sequence of events performedby one actor during a time period.
 8. The system of claim 3, wherein the actor behavior analyzer is configured to examine any changes in actor behavior over time by comparing similarity of past behavior with current behavior.
 9. The system of claim 3, wherein the semantic rule analyzer is configured to encode conditional, provisional, cognitive, operational, and functional knowledge.
 10. The system of claim 3, wherein the rate of change predictor is configured to store similarity and correlation results over time and rates of change over time.
 11. A system for performing cognitive modeling, comprising: an event acquirer configured to acquire an event comprising an associated date and set of data fields; an analyzer element comprising a plurality of components repeated for each field in an event received from the event acquirer; a periodic set of components configured to operate periodically on demand to analyze and predict based on information received from the analyzer element; and a plurality of signal managers interfacing with the analyzer element and the periodic set of components, wherein the periodic set of components is configured to exclude signals based on content properties of data transmitted; wherein the plurality of components and the periodic set of components are configured to interface with a threat detector.
 12. The system of claim 11, wherein the analyzer element comprises a threshold application component, a terrain updater, an outlier analysis module, a threshold violation predictor, a time-ordered behavior evaluator, and a graph updater.
 13. The system of claim 11, wherein the periodic set of components comprises a peer to peer analyzer, an actor correlation analyzer, an actor behavior analyzer, a rate of change predictor, and a semantic rule analyzer.
 14. The system of claim 11, wherein a first signal manager is provided with the analyzer element and a second signal manager is provided with the periodic set of components, and the first signal manager and the second signal manager interface with a signal filter configured to receive signal weights from a signal weight repository.
 15. The system of claim 11 wherein the threat detector employs an extreme vigilance application programming interface.
 16. The system of claim 11, wherein the outlier analyzer determines whether or not frequency, periodicity, and general value of outliers implies the start of another data pattern or a change to an existing data pattern.
 17. The system of claim 12, wherein time-ordered behavior evaluator employs a Markov Graph, to learn a general sequence of events performed by one actor during a time period.
 18. The system of claim 13, wherein the actor behavior analyzer is configured to examine any changes in actor behavior over time by comparing similarity of past behavior with current behavior.
 19. A cognitive modeling apparatus, comprising: an event acquirer; an updating and evaluating arrangement comprising hardware configured to apply thresholds, update event related data, predict thresholds and determine outliers from events received from the event acquirer; and a periodic/on demand apparatus configured to analyze event data on demand; and a series of signal managers comprising a first signal manager connected to the updating and evaluation arrangement and a second signal manager connected to the periodic/on demand apparatus; wherein the series of signal managers are configured to exclude signals based on content properties.
 20. The cognitive modeling apparatus of claim 19, further comprising a threat detector configured to detect threats and a cognitive modeling vigilance apparatus.
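
By way of illustration only, the Markov Graph recited in claims 7 and 17 may be sketched as a first-order transition graph built from the time-ordered events of one actor. The sketch below is a minimal example under stated assumptions and is not the claimed implementation: the class name `MarkovGraph`, its methods `learn` and `probability`, and the event labels are all hypothetical and do not appear in the specification.

```python
from collections import defaultdict


class MarkovGraph:
    """Hypothetical first-order Markov graph over one actor's events."""

    def __init__(self):
        # counts[a][b] = number of times event b directly followed event a
        self.counts = defaultdict(lambda: defaultdict(int))

    def learn(self, events):
        """Update transition counts from a time-ordered list of event labels."""
        for current, following in zip(events, events[1:]):
            self.counts[current][following] += 1

    def probability(self, current, following):
        """Estimated probability that `following` comes right after `current`."""
        successors = self.counts.get(current)
        if not successors:
            return 0.0  # `current` was never seen with a successor
        total = sum(successors.values())
        return successors.get(following, 0) / total


# Illustrative events for one actor during one time period (assumed labels).
graph = MarkovGraph()
graph.learn(["login", "read", "read", "logout"])

p_login_read = graph.probability("login", "read")
p_read_logout = graph.probability("read", "logout")
p_logout_login = graph.probability("logout", "login")
```

In such a sketch, a transition with low estimated probability for a given actor could be treated as a departure from that actor's learned general sequence of events; how such a departure would be turned into a signal is left open here.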